Executive Summary


The  focus  of  this  report  is  on  advanced  tools  for  the  analysis  of  nonlinear 
stochastic  control  and  filtering  systems. 

In sections 1 and 2¹ we present a series of results on the analysis of certain classes of nonlinear filtering problems using comparatively simple bounding techniques. We consider both problems with small noise (large signal-to-noise ratios) and weakly nonlinear systems. We show that the optimal nonlinear filters can be well approximated by linear filters which are very easy to implement. Moreover, we provide sharp estimates of the degree of suboptimality involved in using the linear approximating filters.

In section 3² we consider the problem of managing the estimation of a (nonlinear) diffusion process by a system employing several sensors. The essential problem is to "schedule" the use of the sensors to optimize the estimate of a function of the state of the diffusion process. The solution is obtained in terms of a system of quasi-variational inequalities in the space of solutions of certain Zakai equations.

In section 4 we provide a new proof of the minimum principle in stochastic optimal control theory for systems of partially observed diffusions. In section 5³ we provide a concise analysis of the "conditional adjoint process" arising in the stochastic minimum principle for partially observed diffusion processes.

The  sections  may  be  read  independently. 

¹The work in these sections is joint work by L. Saydy and G.L. Blankenship.

²The work in this section is joint work by J.S. Baras and A. Bensoussan.

³The work in sections 4 and 5 is joint work by J.S. Baras, R.J. Elliott and M. Kohlmann.


1 Optimal Stationary Behavior in Nonlinear Filtering Problems: A Bound Approach

1.1  Introduction 

We consider the Ito stochastic model:

$$dx_t = g(t,x_t)\,dt + \sigma(t)\,dw_t \tag{1}$$

$$dy_t = h(t,x_t)\,dt + \rho(t)\,dv_t$$

$$x(0) = x_0; \quad 0 \le t \le T$$

where $g$, $h$, $\sigma$ and $\rho$ are smooth functions of their arguments, $\{v_t\}$, $\{w_t\}$ are independent Wiener processes, and $x_0$ is a random variable independent of $\{v_t\}$, $\{w_t\}$.

Given this model one is interested in computing least squares estimates of functions of the signal $x_t$ given $\sigma\{y_s, 0 \le s \le t\}$, the $\sigma$-algebra generated by the observations, i.e., quantities of the form $E[\phi(x_t) \mid \sigma\{y_s, 0 \le s \le t\}]$. In many applications this computation must be done recursively. This involves the conditional probability density $p^y(t,x)$, which satisfies a nonlinear stochastic partial differential equation, the Kushner-Stratonovich equation [16]. By considering an unnormalized version of $p^y$, the above problem can be reduced to the study of the Duncan-Mortensen-Zakai (DMZ) equation, which is linear [2].

The filtering problem was completely solved in the context of finite dimensional linear Gaussian systems by Kalman and Bucy [17,18] in 1960-61, and the resulting Kalman filter (KF) has been widely applied. Apart from a few special cases [3,23], the nonlinear case is far more complicated; the evolution of the conditional statistics is, in general, an infinite dimensional system.

Although progress has been made using the DMZ equation, optimal algorithms are not generally available. Suboptimal filters are thus of interest. The performance of suboptimal designs, however derived, may be assessed using lower and upper bounds on the minimum mean square error (optimal MS-error) $p(t)$. This approach is used here to investigate the asymptotic behavior of a class of nonlinear filtering problems.
Two aspects are treated in detail:

1. the long time behavior, that is, the asymptotic behavior of the filter as $t \to \infty$ (this section; see also the paper [21]).

2. the asymptotic behavior as $\epsilon \to 0$, with $\epsilon$ a small parameter in the model (section 2; see also the paper [22]).


To illustrate the ideas, consider the one-dimensional version of the model where $g$ and $h$ have continuous bounded derivatives, say

$$\underline{\alpha}(t) \le g_x(t,x) \le \bar{\alpha}(t) \tag{2}$$

$$\underline{\beta}(t) \le h_x(t,x) \le \bar{\beta}(t) \tag{3}$$

and let

$$p(t) := E[x_t - E(x_t \mid \mathcal{Y}_0^t)]^2, \qquad p^*(t) := E(x_t - x_t^*)^2 \tag{4}$$

where $\mathcal{Y}_0^t = \sigma\{y_s, 0 \le s \le t\}$ and $x_t^*$ is given by:

$$dx_t^* = g(t,x_t^*)\,dt + \frac{\underline{\beta}(t)}{\rho^2(t)}\,u(t)\,\{dy_t - h(t,x_t^*)\,dt\}; \quad x^*(0) = 0 \tag{5}$$

$$\dot{u}(t) = \sigma^2(t) + 2\bar{\alpha}(t)u(t) - \frac{\underline{\beta}^2(t)}{\rho^2(t)}u^2(t); \quad u(0) = \sigma_0^2$$

($x_0 \sim N(0, \sigma_0^2)$ assumed)

Clearly the BOF (bound optimal filter) (5) is readily implementable, with precomputable gain. It coincides with the Kalman filter if $g$ and $h$ are linear. In section 1.2 it is shown by applying results from [7,13] that the BOF is a "best bound" filter in the sense that the associated upper bound $u(t)$ of $p^*(t)$ is the tightest over a class of nonlinear Kalman-like filters and that $p(t)$ is bounded as follows:

$$0 \le \ell(t) \le p(t) \le p^*(t) \le u(t)$$

where $\ell(t)$ satisfies another Riccati equation.
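The remark that the BOF gain is precomputable can be made concrete with a minimal sketch. It assumes an Euler-Maruyama discretization with step `dt`, constant $\sigma$, $\rho$, $\bar{\alpha}$ and $\underline{\beta}$, and illustrative nonlinearities; none of these choices, nor the names `riccati_upper` and `run_bof`, come from the report.

```python
import math
import random

def riccati_upper(sigma, rho, alpha_bar, beta_low, u0, dt, n_steps):
    """Euler integration of u' = sigma^2 + 2*alpha_bar*u - (beta_low/rho)^2 * u^2."""
    u = [u0]
    for _ in range(n_steps):
        du = sigma**2 + 2.0 * alpha_bar * u[-1] - (beta_low / rho) ** 2 * u[-1] ** 2
        u.append(u[-1] + dt * du)
    return u

def run_bof(g, h, sigma, rho, beta_low, u, x0, dt, rng):
    """Simulate one signal path and the BOF estimate (5) driven by its observations."""
    x, x_hat = x0, 0.0
    xs, estimates = [x], [x_hat]
    sqdt = math.sqrt(dt)
    for n in range(len(u) - 1):
        t = n * dt
        dw = rng.gauss(0.0, sqdt)          # signal noise increment
        dv = rng.gauss(0.0, sqdt)          # observation noise increment
        dy = h(t, x) * dt + rho * dv       # observation increment
        gain = beta_low * u[n] / rho**2    # precomputed, deterministic gain
        x_hat += g(t, x_hat) * dt + gain * (dy - h(t, x_hat) * dt)
        x += g(t, x) * dt + sigma * dw
        xs.append(x)
        estimates.append(x_hat)
    return xs, estimates

# Hypothetical example: g_x lies in [-1, -0.5], h_x lies in [0.75, 1.25].
def g_example(t, x):
    return -x + 0.5 * math.tanh(x)

def h_example(t, x):
    return x + 0.25 * math.sin(x)
```

Because `u` depends only on the model bounds, it can be computed once offline; `run_bof` then costs one gain multiplication per observation increment.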
In section 1.3 these bounds are used to address the long time behavior of asymptotically time invariant systems. In the particular case where

$$g(t,x) = ax + \lambda(t)f(t,x) \to ax \quad \text{as } t \to \infty$$

and

$$h(t,x) = cx + \nu(t)k(t,x) \to cx \quad \text{as } t \to \infty$$

it is shown that the BOF is asymptotically optimal in the sense that

$$\lim_{t \to \infty} (p^*(t) - p(t)) = 0$$

and that, as far as the long time performance is concerned, the nonlinearities $f$ and $k$ can be ignored in the original model. In other words the "KF" and even the "SSKF" (steady state) formally designed for the underlying linear system are asymptotically optimal.

In section 1.4 examples with simulation results are given.


1.2 Lower and Upper Bounds on the a priori Optimal MS-Error


Since the explicit solution of nonlinear filtering problems is impossible in general, one is naturally interested in suboptimal solutions, the performance of which may be evaluated using upper and lower bounds on the (unknown) optimal MS-error. In fact, the structural complexity which arises is also present at the level of performance testing, in the sense that simple and tractable bounds are not generally available for suboptimal estimators unless one puts further restrictions on the type of nonlinearities considered.


Consider the one dimensional Ito stochastic differential equation

$$dx_t = g(t,x_t)\,dt + \sigma(t)\,dw_t$$

$$dy_t = h(t,x_t)\,dt + \rho(t)\,dv_t \tag{6}$$

$$x(0) = x_0; \quad 0 \le t \le T$$

$$x_0 \sim p_0(x), \quad E x_0 = 0, \quad E x_0^2 = \sigma_0^2$$

where $\{w_t\}$ and $\{v_t\}$ are independent standard Wiener processes, $x_0$ is a random variable (generally taken to be Gaussian) independent of $\{w_t\}$ and $\{v_t\}$; $g$ and $h$ are differentiable with continuous partial derivatives and such that (6) has a unique solution [1]. Given this model one is interested in finding bounds on the optimal MS-error:

$$p(t) = E[(x_t - E(x_t \mid \mathcal{Y}_0^t))^2] \tag{7}$$

where $\mathcal{Y}_0^t = \sigma\{y_s, 0 \le s \le t\}$ is the $\sigma$-algebra generated by the observations up to time $t$; i.e., find functions $\ell(t)$, $u(t)$ such that:

$$0 \le \ell(t) \le p(t) \le u(t) \tag{8}$$

In this section, existing results are applied to one dimensional systems for which the nonlinearities have bounded derivatives to obtain lower and upper bounds involving ordinary differential equations of the Riccati type. The upper bound is obtained in subsection 1.2.2 by considering a class of nonlinear, Kalman-like suboptimal filters. To each such filter is associated an upper bound on the corresponding mean square error (MSE), and the BOF (bound optimal filter) is defined as the one with the tightest upper bound. The latter is used in inequality (8).


1.2.1  Lower  bound 

The  following  additional  assumptions  make  it  possible  to  derive  a  simple, 
tractable  lower  bound  in  the  one  dimensional  case: 


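$$H_1: \; |g_x(t,x) - \alpha(t)| \le \Delta\alpha(t)$$

$$H_2: \; |h_x(t,x) - \beta(t)| \le \Delta\beta(t), \quad \underline{\beta}(t) := \beta(t) - \Delta\beta(t) > 0$$

We will denote this by:

$$g \in_{\prec} [\alpha(t), \Delta\alpha(t)]$$

$$h \in_{\prec} [\beta(t), \Delta\beta(t)]$$

Remark: The symbol $\Delta$ serves to exhibit the fact that $\Delta\alpha$ is a slope departure function.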




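Proposition 2-1:

Assume $H_1$, $H_2$ hold and let $p(t) := E(x_t - E(x_t \mid \mathcal{Y}_0^t))^2$; then $p(t)$ is lower bounded by $\ell(t)$, i.e., $0 \le \ell(t) \le p(t)$, where $\ell(t)$ satisfies the following Riccati equation:

$$\dot{\ell}(t) = \sigma^2(t) + 2\underline{\alpha}(t)\ell(t) - \frac{1}{\rho^2(t)}\Big[\bar{\beta}^2(t) + 4\frac{\rho^2(t)}{\sigma^2(t)}(\Delta\alpha(t))^2\Big]\ell^2(t) \tag{9}$$

$$\ell(0) = \sigma_0^2$$

with the notation: $\bar{\alpha} = \alpha + \Delta\alpha$, $\underline{\alpha} = \alpha - \Delta\alpha$.

Remark: The above proposition says that the optimal MS-error $p(t)$ corresponding to the nonlinear filtering problem (6) is lower bounded by the optimal MS-error corresponding to the following Kalman filtering problem:

$$dz_t = \underline{\alpha}(t) z_t\,dt + \sigma(t)\,dw_t$$

$$dy'_t = \bar{\beta}(t) z_t\,dt + \rho'(t)\,dv_t$$

It is easily seen (e.g., [16]) that:

$$E[z_t - E(z_t \mid \sigma\{y'_s : 0 \le s \le t\})]^2 = \ell(t)$$

Proof:

Using the Bobrovsky-Zakai lower bound [7] we get that $L(t) \le p(t)$ where

$$\dot{L}(t) = \sigma^2(t) + 2a(t)L(t) - \frac{c^2(t)}{\rho^2(t)}L^2(t), \quad L(0) = \sigma_0^2$$

$$a(t) = E g_x(t,x_t); \quad c^2(t) = E h_x^2(t,x_t) + \frac{\rho^2(t)}{\sigma^2(t)}\,\mathrm{var}(g_x(t,x_t))$$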

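Thus, $L(t)$ satisfies a Riccati equation, the coefficients of which are unknown in general.

Clearly $H_1$ implies: $\underline{\alpha}(t) \le g_x \le \bar{\alpha}(t)$ a.s., and hence $\underline{\alpha}(t) \le a(t) \le \bar{\alpha}(t)$. Thus,

$$|g_x(t,x_t) - a(t)| \le 2\Delta\alpha(t) \quad \text{a.s.}$$

and

$$\mathrm{var}\, g_x(t,x_t) \le 4(\Delta\alpha(t))^2$$

Similarly $H_2$ implies: $0 < \underline{\beta}(t) \le h_x(t,x_t) \le \bar{\beta}(t)$, hence $E h_x^2(t,x_t) \le \bar{\beta}^2(t)$. Therefore:

$$c^2(t) \le \bar{\beta}^2(t) + 4\frac{\rho^2(t)}{\sigma^2(t)}(\Delta\alpha(t))^2$$

Since $L(t)$ satisfies a Riccati equation with strictly positive initial condition, $L(t) > 0$ [11], and the right hand side of the equation for $\dot{L}(t)$ is hence greater than or equal to

$$\sigma^2(t) + 2\underline{\alpha}(t)L(t) - \frac{1}{\rho^2(t)}\Big[\bar{\beta}^2(t) + 4\frac{\rho^2(t)}{\sigma^2(t)}(\Delta\alpha(t))^2\Big]L^2(t)$$

By the comparison theorem (see appendix⁴) we obtain: $\ell(t) \le L(t)$.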


1.2.2  Upper  bound  and  bound  optimal  filter  (BOF) 

Let $x_t$ and $y_t$ be as in (1) and assume that

$$H_1: \; g_x(t,x) \text{ is continuous and } g_x(t,x) \le \bar{\alpha}(t)$$

$$H_2: \; h_x(t,x) \text{ is continuous and } h_x(t,x) \ge \underline{\beta}(t) > 0$$

Proposition 2-2:

The optimal MS-error $p(t)$ is upper bounded by $u(t)$ where $u(t)$ satisfies the Riccati equation:

$$\dot{u}(t) = \sigma^2(t) + 2\bar{\alpha}(t)u(t) - \frac{\underline{\beta}^2(t)}{\rho^2(t)}u^2(t) \tag{10}$$

$$u(0) = \sigma_0^2$$

Remark: This says that the optimal MS-error in the nonlinear filtering problem (1) is upper bounded by the optimal MS-error in the following linear one:

$$dz_t = \bar{\alpha}(t) z_t\,dt + \sigma(t)\,dw_t$$

$$dy'_t = \underline{\beta}(t) z_t\,dt + \rho(t)\,dv_t$$

⁴The Appendix follows Section 2.


Proof:


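The conditional mean $\hat{x}_t := E(x_t \mid \mathcal{Y}_0^t)$ and the conditional optimal MS-error

$$P_t := E[(x_t - \hat{x}_t)^2 \mid \mathcal{Y}_0^t]$$

are given by [16]:

$$d\hat{x}_t = \hat{g}_t\,dt + \frac{e_t}{\rho^2(t)}\,d\bar{w}_t; \quad \hat{x}_0 = 0$$

$$dP_t = \Big[\sigma^2(t) + 2(\widehat{x_t g_t} - \hat{x}_t\hat{g}_t) - \frac{1}{\rho^2(t)}(e_t)^2\Big]dt + \frac{\Gamma_t}{\rho^2(t)}\,d\bar{w}_t; \quad P_0 = \sigma_0^2 \tag{11}$$

where $\hat{(\cdot)}$ denotes conditional expectation given $\mathcal{Y}_0^t$ and

$$g_t = g(t,x_t); \quad h_t = h(t,x_t)$$

$$e_t = \widehat{x_t h_t} - \hat{x}_t\hat{h}_t$$

$$\Gamma_t = \widehat{x_t^2 h_t} - \widehat{x_t^2}\,\hat{h}_t - 2\hat{x}_t\,\widehat{x_t h_t} + 2(\hat{x}_t)^2\hat{h}_t$$

and $d\bar{w}_t := dy_t - \hat{h}_t\,dt$ is the innovation process, which is (up to the scale factor $\rho(t)$) a Wiener process on $\mathcal{Y}_0^t$.

Since the expectation of Ito integrals is zero and $E P_t = E(x_t - \hat{x}_t)^2 = p(t)$, by taking the expectation on both sides of (11) we get

$$\dot{p}(t) = \sigma^2(t) + 2E[\widehat{x_t g_t} - \hat{x}_t\hat{g}_t] - \frac{1}{\rho^2(t)}E(e_t)^2; \quad p(0) = \sigma_0^2$$

The smoothing property of conditional expectations [1] implies, with $\tilde{x}_t := x_t - \hat{x}_t$,

$$E(\widehat{x_t g_t} - \hat{x}_t\hat{g}_t) = E(x_t - \hat{x}_t)(g_t - g(t,\hat{x}_t)) = E\tilde{x}_t(g_t - g(t,\hat{x}_t))$$

Therefore,

$$\dot{p}(t) = \sigma^2(t) + 2E\tilde{x}_t(g_t - g(t,\hat{x}_t)) - \frac{1}{\rho^2(t)}E(e_t)^2; \quad p(0) = \sigma_0^2 \tag{12}$$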
(12) 
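Jensen's inequality [1] implies that:

$$E(e_t)^2 \ge (E e_t)^2$$

$$E e_t = E(\widehat{x_t h_t} - \hat{x}_t\hat{h}_t) = E\tilde{x}_t(h_t - h(t,\hat{x}_t))$$

Now

$$h(t,x_t) - h(t,\hat{x}_t) = \tilde{x}_t \int_0^1 h_x[t, \hat{x}_t + s\tilde{x}_t]\,ds := \tilde{x}_t\psi_h$$

Hence,

$$E e_t = E\tilde{x}_t^2\psi_h$$

$H_2$ implies that $\psi_h \ge \underline{\beta}(t)$ a.s., so that

$$E e_t \ge \underline{\beta}(t)E\tilde{x}_t^2 = \underline{\beta}(t)p(t) \tag{13}$$

$$E(e_t)^2 \ge (E e_t)^2 \ge \underline{\beta}^2(t)p^2(t)$$

Similarly $H_1$ implies that

$$E\tilde{x}_t(g_t - g(t,\hat{x}_t)) = E\tilde{x}_t^2\psi_g \le \bar{\alpha}(t)E\tilde{x}_t^2 = \bar{\alpha}(t)p(t) \tag{14}$$

where $\psi_g$ is defined from $g_x$ in the same way as $\psi_h$ from $h_x$. Combining (12)-(14) and using the comparison theorem in the appendix yields: $p(t) \le u(t)$.

QED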


An alternate and more constructive approach to getting the same result, due to A.S. Gilman and I.B. Rhodes [13], is outlined below. The upper bound is derived by considering the following family of parametrized nonlinear suboptimal filters, the structure of which is suggested by the Kalman filter:

$$dx_t^{(k)} = g(t,x_t^{(k)})\,dt + k(t)[dy_t - h(t,x_t^{(k)})\,dt], \quad x_0^{(k)} = 0 \tag{15}$$

where $k(t)$ is a non random, continuous, non-negative bounded function.

To each gain $k(t)$ is associated a suboptimal filter given by (15) and denoted $\{x\}_k$. It can be shown ([13,20]) that:

1. Corresponding to each $\{x\}_k$ there exists a function $u_k(t)$ satisfying the linear ODE:

$$\dot{u}_k(t) = \sigma^2(t) + \rho^2(t)k^2(t) + 2[\bar{\alpha}(t) - k(t)\underline{\beta}(t)]u_k(t); \quad u_k(0) = \sigma_0^2 \tag{16}$$

such that

$$p_k(t) := E(x_t - x_t^{(k)})^2 \le u_k(t) \tag{17}$$

2. The suboptimal filter $\{x\}_{k^*}$ obtained for the particular choice $k^*(t) = \underline{\beta}(t)u(t)/\rho^2(t)$, i.e.,

$$dx_t^* = g(t,x_t^*)\,dt + \frac{\underline{\beta}(t)}{\rho^2(t)}u(t)[dy_t - h(t,x_t^*)\,dt]; \quad x_0^* = 0 \tag{18}$$

where $u(t)$ satisfies the Riccati equation:

$$\dot{u}(t) = \sigma^2(t) + 2\bar{\alpha}(t)u(t) - \frac{\underline{\beta}^2(t)}{\rho^2(t)}u^2(t); \quad u(0) = \sigma_0^2 \tag{19}$$

is such that $u(t) \le u_k(t)$ for every continuous nonnegative function $k(t)$. More importantly, we have the following inequalities:

$$p(t) := E(x_t - E(x_t \mid \mathcal{Y}_0^t))^2 \le p^*(t) := E(x_t - x_t^*)^2 \le u(t) \tag{20}$$

The nonlinear filter given by (18)-(19), subsequently referred to as the bound optimal filter (BOF), will turn out to be near optimal in many situations of practical importance, as will be seen in the next subsection and in [21].
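The claim $u(t) \le u_k(t)$ can be checked numerically. The sketch below is illustrative only: it assumes constant coefficients and Euler integration, and the function names and parameter values are hypothetical. It integrates the linear ODE (16) for a few constant gains and compares the result with the Riccati solution (19).

```python
def upper_bound_for_gain(gain, sigma, rho, alpha_bar, beta_low, var0, dt, n_steps):
    """Euler integration of the linear ODE (16) for a given gain function k(t)."""
    u_k = var0
    for n in range(n_steps):
        k = gain(n * dt)
        u_k += dt * (sigma**2 + rho**2 * k**2 + 2.0 * (alpha_bar - k * beta_low) * u_k)
    return u_k

def bof_upper_bound(sigma, rho, alpha_bar, beta_low, var0, dt, n_steps):
    """Euler integration of the Riccati equation (19), i.e. (16) with k = beta_low*u/rho^2."""
    u = var0
    for _ in range(n_steps):
        u += dt * (sigma**2 + 2.0 * alpha_bar * u - (beta_low / rho) ** 2 * u**2)
    return u

# Hypothetical constant coefficients.
PARAMS = dict(sigma=0.2, rho=0.2, alpha_bar=-0.5, beta_low=0.75,
              var0=1.0, dt=1e-3, n_steps=2000)
CONSTANT_GAINS = [0.0, 0.25, 0.5, 1.0, 2.0, 5.0]

u_star = bof_upper_bound(**PARAMS)
u_for_gains = [upper_bound_for_gain(lambda t, k=k: k, **PARAMS) for k in CONSTANT_GAINS]
```

The comparison works because, for a fixed value of $u$, the right hand side of (16) is a quadratic in $k$ minimized at $k = \underline{\beta}u/\rho^2$, which turns (16) into (19).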


1.2.3  Summary 

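For systems modeled by one dimensional Ito SDE's of the form:

$$dx_t = g(t,x_t)\,dt + \sigma(t)\,dw_t \tag{21}$$

$$dy_t = h(t,x_t)\,dt + \rho(t)\,dv_t \tag{22}$$

$$E x_0 = 0, \quad E x_0^2 = \sigma_0^2$$

with $g$ and $h$ satisfying

$$H_1: \; |g_x(t,x) - \alpha(t)| \le \Delta\alpha(t) \quad \text{denoted by } g \in_{\prec} [\alpha(t), \Delta\alpha(t)] \tag{23}$$

$$H_2: \; |h_x(t,x) - \beta(t)| \le \Delta\beta(t) \quad \text{denoted by } h \in_{\prec} [\beta(t), \Delta\beta(t)] \tag{24}$$

define

$$\bar{\alpha}(t) := \alpha(t) + \Delta\alpha(t); \quad \underline{\alpha}(t) := \alpha(t) - \Delta\alpha(t) \tag{25}$$

$$\bar{\beta}(t) := \beta(t) + \Delta\beta(t); \quad \underline{\beta}(t) := \beta(t) - \Delta\beta(t) > 0 \tag{26}$$

$$p(t) := E(x_t - E(x_t \mid \mathcal{Y}_0^t))^2 \tag{27}$$

$$p^*(t) := E(x_t - x_t^*)^2 \tag{28}$$

where $x_t^*$ is the BOF and is given by

$$dx_t^* = g(t,x_t^*)\,dt + \frac{\underline{\beta}(t)}{\rho^2(t)}u(t)[dy_t - h(t,x_t^*)\,dt]; \quad x_0^* = 0 \tag{29}$$

$$\dot{u}(t) = \sigma^2(t) + 2\bar{\alpha}(t)u(t) - \frac{\underline{\beta}^2(t)}{\rho^2(t)}u^2(t); \quad u(0) = \sigma_0^2 \tag{30}$$

Then by combining the results from the previous two sections we readily get the following bounds on the optimal MS-error:

$$0 \le \ell(t) \le p(t) \le p^*(t) \le u(t) \tag{31}$$

where

$$\dot{\ell}(t) = \sigma^2(t) + 2\underline{\alpha}(t)\ell(t) - \frac{1}{\rho^2(t)}\Big[\bar{\beta}^2(t) + 4\frac{\rho^2(t)}{\sigma^2(t)}(\Delta\alpha(t))^2\Big]\ell^2(t) \tag{32}$$

$$\ell(0) = \sigma_0^2$$

and $u(t)$ satisfies (30).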
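The two Riccati bounds summarized above are cheap to evaluate. The following sketch, which assumes constant $\alpha$, $\Delta\alpha$, $\beta$, $\Delta\beta$ and a plain Euler scheme (both assumptions made here for illustration, as are the function names and numbers), integrates (30) and (32) side by side.

```python
def riccati_step(v, dt, sigma2, theta, quad):
    """One Euler step of v' = sigma2 + 2*theta*v - quad*v^2."""
    return v + dt * (sigma2 + 2.0 * theta * v - quad * v * v)

def mse_bounds(sigma, rho, alpha, d_alpha, beta, d_beta, var0, dt, n_steps):
    """Return the lower bound l(t) from (32) and the upper bound u(t) from (30)."""
    alpha_low, alpha_bar = alpha - d_alpha, alpha + d_alpha
    beta_low, beta_bar = beta - d_beta, beta + d_beta
    # Quadratic coefficients of the two Riccati equations.
    quad_l = (beta_bar**2 + 4.0 * (rho**2 / sigma**2) * d_alpha**2) / rho**2
    quad_u = beta_low**2 / rho**2
    lower, upper = [var0], [var0]
    for _ in range(n_steps):
        lower.append(riccati_step(lower[-1], dt, sigma**2, alpha_low, quad_l))
        upper.append(riccati_step(upper[-1], dt, sigma**2, alpha_bar, quad_u))
    return lower, upper

# Hypothetical data: g_x in [-1, -0.5], h_x in [0.75, 1.25].
lower, upper = mse_bounds(sigma=0.2, rho=0.2, alpha=-0.75, d_alpha=0.25,
                          beta=1.0, d_beta=0.25, var0=1.0, dt=1e-3, n_steps=3000)
```

The width `upper[n] - lower[n]` gives a computable bracket on how far the BOF can be from the optimal filter at time `n * dt`, without solving the filtering problem itself.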


1.3  Asymptotically  Linear  Systems 

In this section we discuss systems that are asymptotically time invariant, i.e.,

$$dx_t = g(t,x_t)\,dt + \sigma\,dw_t \tag{33}$$

$$dy_t = h(t,x_t)\,dt + \rho\,dv_t \tag{34}$$

where

$$g(t,x) = g(x) + \lambda(t)f(t,x)$$

$$h(t,x) = h(x) + \nu(t)k(t,x)$$

$$g \in_{\prec} [a, \Delta a]; \quad f \in_{\prec} [\mu(t), \Delta\mu(t)]$$

$$h \in_{\prec} [c, \Delta c]; \quad k \in_{\prec} [\xi(t), \Delta\xi(t)]$$

and

$$\lim_{t \to \infty} [\lambda(t), \nu(t)] = [0, 0] \tag{35}$$

In the particular case where $g(x)$ and $h(x)$ are linear (the limiting system is linear), one is interested in knowing whether the Kalman filter (KF) designed formally for the limiting linear system and driven by the (nonlinear) observations $y_t$ in (34) is asymptotically optimal as $t$ becomes large. This situation arises for example when the nonlinearities are neglected during the modeling process. With an abuse of terminology, the nonlinear filter resulting from the scheme just described will be referred to as the "KF."

More specifically, let

$$dx_t = a x_t\,dt + \lambda(t)f(t,x_t)\,dt + \sigma\,dw_t \tag{36}$$

$$dy_t = c x_t\,dt + \nu(t)k(t,x_t)\,dt + \rho\,dv_t \tag{37}$$

$$E x_0 = 0, \quad E x_0^2 = \sigma_0^2 > 0$$

Then the "KF" designed for the limiting system is

$$dx_t^k = a x_t^k\,dt + \frac{c}{\rho^2}r(t)[dy_t - c x_t^k\,dt]; \quad x^k(0) = 0 \tag{38}$$

$$\dot{r}(t) = \sigma^2 + 2ar - \frac{c^2}{\rho^2}r^2; \quad r(0) = \sigma_0^2 \tag{39}$$

and the questions of interest are:

• Under what conditions is $x_t^k$ (or the BOF $x_t^*$) asymptotically optimal as $t \to \infty$, i.e., $\lim_{t\to\infty}(p^k(t) - p(t)) = 0$ ($\lim_{t\to\infty}(p^*(t) - p(t)) = 0$)? Here

$$p^k(t) = E(x_t - x_t^k)^2 \tag{40}$$

$$p^*(t) = E(x_t - x_t^*)^2 \tag{41}$$

$$p(t) = E(x_t - E(x_t \mid \mathcal{Y}_0^t))^2 \tag{42}$$

• Would the same result hold for the steady state "KF" ("SSKF"), obtained by setting $r(t) = r(\infty)$ in (38)?
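To make the "KF" and "SSKF" gains concrete, the sketch below integrates the Riccati equation (39) with an Euler scheme and compares the result with the positive root of the algebraic equation obtained by setting $\dot{r} = 0$ in (39). The function names and the numerical values are illustrative assumptions.

```python
import math

def kf_variance(a, c, sigma, rho, var0, dt, n_steps):
    """Euler integration of r' = sigma^2 + 2*a*r - (c/rho)^2 * r^2, r(0) = var0."""
    r = var0
    for _ in range(n_steps):
        r += dt * (sigma**2 + 2.0 * a * r - (c / rho) ** 2 * r**2)
    return r

def steady_state_variance(a, c, sigma, rho):
    """Positive root of sigma^2 + 2*a*r - (c/rho)^2 * r^2 = 0."""
    return (rho**2 / c**2) * (a + math.sqrt(a**2 + (c * sigma / rho) ** 2))

def gain(c, rho, r):
    """Filter gain c*r/rho^2 used in (38); pass r(infinity) for the SSKF."""
    return c * r / rho**2

# Hypothetical data.
r_T = kf_variance(a=-1.0, c=1.0, sigma=0.2, rho=0.2, var0=1.0, dt=1e-3, n_steps=10000)
r_inf = steady_state_variance(a=-1.0, c=1.0, sigma=0.2, rho=0.2)
```

With these numbers the time-varying gain `gain(1.0, 0.2, r_T)` and the steady state gain `gain(1.0, 0.2, r_inf)` agree to many digits after ten time units, which is why the "SSKF" is a natural candidate in the second question above.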
The bounds on the optimal MS-error derived in the previous section are used to answer these questions in the linear limiting case. However, the bounds on the partial derivatives of the nonlinearities do not contain "enough information" to treat similar questions in the general case where $g(x)$ and $h(x)$ are nonlinear.

Consequently, we will only consider the class of nonlinear filtering problems (36) with the assumptions:

$H_1$: $f \in_{\prec} [\mu(t), \Delta\mu(t)]$; $k \in_{\prec} [\xi(t), \Delta\xi(t)]$

$H_2$: $\lambda(t)$ and $\nu(t)$ are continuous functions on $[0, \infty[$ that vanish as $t \to \infty$, and are nonnegative for simplicity

$H_3$: $\mu(t)$, $\Delta\mu(t)$, $\xi(t)$ and $\Delta\xi(t)$ are bounded continuous functions on $[0, \infty[$.

$H_4$: $c + \nu(t)\underline{\xi}(t) \ge \delta_0 > 0$; $c \ne 0$.

In the next two subsections we show that:

$$\lim_{t\to\infty}(p^*(t) - p(t)) = 0 \quad \text{and} \quad \lim_{t\to\infty}(p^k(t) - p(t)) = 0 \tag{43}$$

This is done by bounding $p(t)$ as

$$0 \le \ell(t) \le p(t) \le p^*(t) \le u(t) \tag{44}$$

$$0 \le \ell(t) \le p(t) \le p^k(t) \le q(t) \tag{45}$$

and showing that

$$\lim_{t\to\infty}(u(t) - \ell(t)) = 0 \quad \text{and} \quad \lim_{t\to\infty}(q(t) - \ell(t)) = 0 \tag{46}$$

The result is then generalized to the case

$$g(t,x) = ax + \sum_{i=1}^{n}\lambda_i(t)f_i(t,x) \tag{47}$$

$$h(t,x) = cx + \sum_{i=1}^{m}\nu_i(t)k_i(t,x) \tag{48}$$

which in turn can be applied to treat cases where $a$ and $c$ are time varying functions.
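1.3.1 Asymptotic optimality of the BOF

In the case of (36), we note that $H_1$ and $H_2$ imply:

$$g(t,x) = ax + \lambda(t)f(t,x) \in_{\prec} [a + \lambda(t)\mu(t), \lambda(t)\Delta\mu(t)] \tag{49}$$

$$h(t,x) = cx + \nu(t)k(t,x) \in_{\prec} [c + \nu(t)\xi(t), \nu(t)\Delta\xi(t)] \tag{50}$$

Thus the results in section 1.2.3 apply with

$$\bar{\alpha} = a + \lambda(t)\mu(t) + \lambda(t)\Delta\mu(t) = a + \lambda(t)\bar{\mu}(t) \tag{51}$$

$$\underline{\alpha} = a + \lambda(t)\underline{\mu}(t) \tag{52}$$

$$\bar{\beta} = c + \nu(t)\bar{\xi}(t) \tag{53}$$

$$\underline{\beta} = c + \nu(t)\underline{\xi}(t) \tag{54}$$

and the BOF is given here by:

$$dx_t^* = a x_t^*\,dt + \lambda(t)f(t,x_t^*)\,dt + \frac{\underline{\beta}}{\rho^2}u(t)[dy_t - c x_t^*\,dt - \nu(t)k(t,x_t^*)\,dt]; \quad x_0^* = 0 \tag{55}$$

$$\dot{u}(t) = \sigma^2 + 2\bar{\alpha}u(t) - \frac{\underline{\beta}^2}{\rho^2}u^2(t); \quad u(0) = \sigma_0^2 \tag{56}$$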

The asymptotic optimality of the BOF is a direct consequence of the following Lemma:

Lemma 3-1: Let $\theta_1$, $\theta_2$, $\gamma_1$ and $\gamma_2$ be continuous functions on $[0, +\infty[$ such that

$$\lim_{t\to\infty}\theta_i(t) = a$$

$$\lim_{t\to\infty}\gamma_i^2(t) = c^2; \quad t \ge 0; \quad i = 1, 2$$

and consider the Riccati equations:

$$\dot{v}_1 = \sigma^2 + 2\theta_1 v_1 - \frac{\gamma_1^2}{\rho^2}v_1^2; \quad v_1(0) = \sigma_0^2$$

$$\dot{v}_2 = \sigma^2 + 2\theta_2 v_2 - \frac{\gamma_2^2}{\rho^2}v_2^2; \quad v_2(0) = \sigma_0^2$$

If $v_1(t) \ge v_2(t)$ and if one of the assumptions given below holds then:

$$\lim_{t\to\infty}v_1(t) = \lim_{t\to\infty}v_2(t)$$

Assumptions:

$A_1$: $a < 0$

$A_2$: $v_2(t) \ge r(t)$, $t \ge 0$ and $\gamma_1^2(t) \ge \delta^2 > 0$ for some $\delta$

Recall that:

$$\dot{r}(t) = \sigma^2 + 2ar - \frac{c^2}{\rho^2}r^2, \quad r(0) = \sigma_0^2 \tag{59}$$
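Proof:

Let $w(t) = v_1(t) - v_2(t) \ge 0$. Then a straightforward computation yields

$$\dot{w} = 2(\theta_1 - \theta_2)v_2 + \frac{1}{\rho^2}(\gamma_2^2 - \gamma_1^2)v_2^2 + 2\Big(\theta_1 - \frac{\gamma_1^2}{\rho^2}v_2\Big)w - \frac{\gamma_1^2}{\rho^2}w^2 \tag{60}$$

$$w(0) = 0$$

which we rewrite as

$$\dot{w}(t) = i(t) + 2j(t)w - \frac{\gamma_1^2}{\rho^2}w^2, \quad w(0) = 0 \tag{61}$$

where

$$i(t) = 2(\theta_1 - \theta_2)v_2 + \frac{1}{\rho^2}(\gamma_2^2 - \gamma_1^2)v_2^2 \tag{62}$$

$$j(t) = \theta_1 - \frac{\gamma_1^2}{\rho^2}v_2 \tag{63}$$

Equation (61) clearly implies:

$$\dot{w} \le i(t) + 2j(t)w. \tag{64}$$

Depending on the assumption used ($A_1$ or $A_2$) we will bound $w(t)$ differently using the comparison theorem.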
Assumption $A_1$:

Since $v_2(t)$ and $w(t)$ are nonnegative, $\dot{w}(t)$ can be bounded as

$$\dot{w} \le i(t) + 2\theta_1 w \tag{65}$$

thus $0 \le w(t) \le z(t)$ where

$$\dot{z}(t) = i(t) + 2\theta_1 z; \quad z(0) = 0 \tag{66}$$

Similarly $v_1(t) \le V_1(t)$ where

$$\dot{V}_1 = \sigma^2 + 2\theta_1 V_1; \quad V_1(0) = \sigma_0^2 \tag{67}$$

If $a < 0$ then $\lim_{t\to\infty}\theta_1 = a < 0$ and Perron's theorem (see the appendix) can be applied to (66) and (67). We get

$$V_1(\infty) = -\frac{\sigma^2}{2a}. \tag{68}$$

Since $v_2(t) \le v_1(t) \le V_1(t)$ for every $t \ge 0$, (68) implies

$$\lim_{t\to\infty} i(t) = 0$$

Re-applied to (66), Perron's theorem yields

$$\lim_{t\to\infty} z(t) = 0, \quad \text{that is,} \quad \lim_{t\to\infty} w(t) = 0$$

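Assumption $A_2$:

Since $v_2(t) \ge r(t)$, $j(t) \le \theta_1 - \frac{\gamma_1^2}{\rho^2}r(t)$; (64) then implies that $w(t) \le z(t)$, where:

$$\dot{z} = i(t) + 2\Big(\theta_1 - \frac{\gamma_1^2}{\rho^2}r(t)\Big)z(t); \quad z(0) = 0 \tag{69}$$

Now $\lim_{t\to\infty}\big(\theta_1 - \frac{\gamma_1^2}{\rho^2}r(t)\big) = a - \frac{c^2}{\rho^2}r(\infty)$; but $r(\infty)$ is the positive root of

$$\sigma^2 + 2ax - \frac{c^2}{\rho^2}x^2 = 0 \tag{70}$$

i.e.,

$$r(\infty) = \frac{\rho^2}{c^2}\Big[a + \Big(a^2 + \frac{c^2}{\rho^2}\sigma^2\Big)^{1/2}\Big]$$

and

$$a - \frac{c^2}{\rho^2}r(\infty) = -\Big(a^2 + \frac{c^2}{\rho^2}\sigma^2\Big)^{1/2} < 0$$

Thus $\lim_{t\to\infty}z(t) = 0$ provided $\lim_{t\to\infty}i(t) = 0$. For this to happen it suffices that $v_2(t)$ be bounded ($v_1(t)$ be bounded). Using the assumptions and the comparison theorem, we immediately get $v_1(t) \le V_1(t)$ where

$$\dot{V}_1 = \sigma^2 + 2\theta_M V_1 - \frac{\delta^2}{\rho^2}V_1^2; \quad V_1(0) = \sigma_0^2 \tag{71}$$

and $\theta_M$ is a nonzero upper bound of $\theta_1(t)$. $V_1(t)$ is clearly bounded. We conclude that $\lim_{t\to\infty}z(t) = 0$, i.e., $\lim_{t\to\infty}w(t) = 0$.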

Note: We can conclude that $v_1(\infty) = v_2(\infty) = r(\infty)$ provided one of the following holds:

1. $v_1(t) \ge v_2(t) \ge r(t)$; $t \ge 0$

2. $v_1(t) \ge r(t) \ge v_2(t)$; and $a < 0$

3. $r(t) \ge v_1(t) \ge v_2(t)$ and $a < 0$.

This last assertion is obtained by applying the above Lemma to the pair $(r, v_2)$.
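The way Perron's theorem is used in the proof can be illustrated numerically: for a scalar linear ODE $\dot{z} = i(t) + \kappa(t)z$ whose coefficient $\kappa(t)$ tends to a negative limit and whose forcing $i(t)$ tends to a limit, the solution tends to $-i(\infty)/\kappa(\infty)$. The sketch below is only an illustration under assumed, hypothetical choices of $i$ and $\theta_1$ (with $\kappa = 2\theta_1$); it is not a proof and does not reproduce the appendix.

```python
import math

def integrate_linear_ode(forcing, coeff, z0, dt, n_steps):
    """Euler integration of z' = forcing(t) + coeff(t) * z, z(0) = z0."""
    z = z0
    for n in range(n_steps):
        t = n * dt
        z += dt * (forcing(t) + coeff(t) * z)
    return z

# Hypothetical theta_1(t) -> a = -1 < 0.
def theta_1(t):
    return -1.0 + math.exp(-t)

# Analogue of (66): vanishing forcing, so z(t) -> 0.
z_T = integrate_linear_ode(lambda t: math.exp(-t), lambda t: 2.0 * theta_1(t),
                           z0=0.0, dt=1e-3, n_steps=20000)

# Analogue of (67): constant forcing sigma^2, so V_1(t) -> -sigma^2 / (2a).
sigma2 = 0.04
V_T = integrate_linear_ode(lambda t: sigma2, lambda t: 2.0 * theta_1(t),
                           z0=1.0, dt=1e-3, n_steps=20000)
```

Here `z_T` is close to zero and `V_T` is close to $-\sigma^2/(2a) = 0.02$, matching the two applications of the theorem to (66) and (67).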

Proposition 3-2: If $H_1$-$H_4$ and $H_5$ or $H_6$ hold, where

$H_5$: $a < 0$

$H_6$: $\ell(t) \ge r(t)$, $t \ge 0$

then the BOF given by (55)-(56) is asymptotically optimal as $t \to \infty$.

Proof:

We have that $0 \le \ell(t) \le p(t) \le p^*(t) \le u(t)$, where $\ell(t)$ and $u(t)$ are given by (51), (52), (56) and (58) in section 1.2.3. Lemma 3-1 can then be applied to $u(t)$ and $\ell(t)$ by taking:

$$\theta_1(t) = \bar{\alpha}(t) = a + \lambda(t)\bar{\mu}(t) \tag{72}$$

$$\theta_2(t) = \underline{\alpha}(t) = a + \lambda(t)\underline{\mu}(t) \tag{73}$$

$$\gamma_1^2(t) = \underline{\beta}^2(t) = [c + \nu(t)\underline{\xi}(t)]^2 \tag{74}$$

$$\gamma_2^2(t) = \bar{\beta}^2(t) + 4\frac{\rho^2}{\sigma^2}(\Delta\alpha(t))^2 = (c + \nu(t)\bar{\xi}(t))^2 + 4\frac{\rho^2}{\sigma^2}\lambda^2(t)(\Delta\mu(t))^2 \tag{75}$$

It is readily checked that all hypotheses in Lemma 3-1 are satisfied and the result follows.
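A numerical illustration of the mechanism behind Proposition 3-2: with vanishing $\lambda(t)$ and $\nu(t)$, the coefficients (72)-(75) converge to those of the limiting linear problem, and the gap $u(t) - \ell(t)$ closes. The sketch assumes constant $\mu$, $\Delta\mu$, $\xi$, $\Delta\xi$, exponentially decaying $\lambda = \nu = e^{-t}$ and Euler integration; all of these, and the function name, are hypothetical choices.

```python
import math

def asymptotic_gap(a, c, sigma, rho, lam, nu, mu, d_mu, xi, d_xi, var0, dt, n_steps):
    """Integrate u (upper) and l (lower) with the coefficients (72)-(75); return both and the gaps."""
    l = u = var0
    gaps = []
    for n in range(n_steps):
        t = n * dt
        theta1 = a + lam(t) * (mu + d_mu)                      # (72)
        theta2 = a + lam(t) * (mu - d_mu)                      # (73)
        gamma1_sq = (c + nu(t) * (xi - d_xi)) ** 2             # (74)
        gamma2_sq = ((c + nu(t) * (xi + d_xi)) ** 2
                     + 4.0 * (rho / sigma) ** 2 * (lam(t) * d_mu) ** 2)  # (75)
        u += dt * (sigma**2 + 2.0 * theta1 * u - gamma1_sq / rho**2 * u * u)
        l += dt * (sigma**2 + 2.0 * theta2 * l - gamma2_sq / rho**2 * l * l)
        gaps.append(u - l)
    return l, u, gaps

def decay(t):
    return math.exp(-t)

# Hypothetical data: f_x in [0, 1], k_x in [-0.5, 0.5], a < 0 (assumption H5).
l_T, u_T, gaps = asymptotic_gap(a=-1.0, c=1.0, sigma=0.2, rho=0.2, lam=decay, nu=decay,
                                mu=0.5, d_mu=0.5, xi=0.0, d_xi=0.5,
                                var0=1.0, dt=1e-3, n_steps=15000)
r_inf = (0.2**2 / 1.0**2) * (-1.0 + math.sqrt(1.0 + 1.0))
```

In this run the gap stays nonnegative and both bounds approach the steady state Kalman variance `r_inf` of the limiting linear system.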


Remarks:

(1) It follows directly from Lemma 3-1 that if $H_6$ is replaced by $H_6'$: $a < 0$ and either $u(t) \ge r(t)$ or $u(t) \le r(t)$, then

$$p(\infty) = p^*(\infty) = r(\infty) = \frac{\rho^2}{c^2}\Big[a + \Big(a^2 + \frac{\sigma^2}{\rho^2}c^2\Big)^{1/2}\Big]$$

(2) A sufficient condition for $H_6$ to hold is $\underline{\mu} \ge 0$ and

$$\Big(1 + \frac{\nu\bar{\xi}}{c}\Big)^2 + 4\lambda^2\frac{\rho^2}{\sigma^2}\frac{(\Delta\mu)^2}{c^2} \le 1 \quad \text{for every } t \ge 0.$$

Assuming that $c > 0$ and rewriting the last inequality as

$$2\nu\bar{\xi} + \nu^2\frac{\bar{\xi}^2}{c} + 4\lambda^2\frac{\rho^2}{\sigma^2}\frac{(\Delta\mu)^2}{c} \le 0,$$

it can be seen that a necessary condition for this last inequality to hold is $\bar{\xi} \le 0$. It turns out that $H_6$ holds in many cases if $f$ and $k$ lie in the first/third quadrant ($\underline{\mu} \ge 0$) and second/fourth quadrant ($\bar{\xi} \le 0$) respectively (e.g. see Example (2) below).

In general, hypotheses such as $H_6$ should be checked numerically.


Next we generalize Proposition 3-2 to nonlinearities of the following type:

$$g(t,x) = ax + \sum_{i=1}^{n}\lambda_i(t)f_i(t,x) \tag{76}$$

$$h(t,x) = cx + \sum_{j=1}^{m}\nu_j(t)k_j(t,x) \tag{77}$$

with the assumptions $H_1$, $H_2$ and $H_3$ holding for each $i = 1, \ldots, n$; $j = 1, \ldots, m$.

Using a vector notation, e.g., $\Delta\mu = (\Delta\mu_1, \ldots, \Delta\mu_n)^T$, and $\langle\cdot,\cdot\rangle_n$ to denote the inner product in $R^n$, the nonlinearities above can be written in the more condensed form:

$$g(t,x) = ax + \langle\lambda(t), f(t,x)\rangle_n \tag{78}$$

$$h(t,x) = cx + \langle\nu(t), k(t,x)\rangle_m \tag{79}$$

and we clearly have

$$g \in_{\prec} [a + \langle\lambda, \mu\rangle_n; \; \langle\lambda, \Delta\mu\rangle_n] \tag{80}$$

$$h \in_{\prec} [c + \langle\nu, \xi\rangle_m; \; \langle\nu, \Delta\xi\rangle_m] \tag{81}$$

Thus, if we make the additional hypothesis $H_4$: $\underline{\beta} = c + \langle\nu, \underline{\xi}\rangle_m \ge \delta_0 > 0$, then the same results hold. More precisely, the BOF is given by

$$dx_t^* = a x_t^*\,dt + \langle\lambda(t), f(t,x_t^*)\rangle\,dt + \frac{\underline{\beta}(t)}{\rho^2}u(t)[dy_t - c x_t^*\,dt - \langle\nu(t), k(t,x_t^*)\rangle\,dt]; \quad x^*(0) = 0 \tag{82}$$

with the corresponding MSE $p^*(t)$ and the optimal MS-error $p(t)$ satisfying

$$0 \le \ell(t) \le p(t) \le p^*(t) \le u(t) \tag{83}$$

where $\ell(t)$ and $u(t)$ are given in section 1.2.3 with

$$\bar{\alpha}(t) = a + \langle\lambda(t), \bar{\mu}(t)\rangle \tag{84}$$

$$\underline{\alpha}(t) = a + \langle\lambda(t), \underline{\mu}(t)\rangle \tag{85}$$

$$\Delta\alpha = \langle\lambda(t), \Delta\mu\rangle \tag{86}$$

$$\bar{\beta}(t) = c + \langle\nu(t), \bar{\xi}(t)\rangle \tag{87}$$

$$\underline{\beta}(t) = c + \langle\nu(t), \underline{\xi}(t)\rangle \tag{88}$$

The following corollary is now a direct application of Lemma 3-1.

Corollary 3-3: If $H_1$-$H_4$ and $H_5$ or $H_6$ stated below are satisfied, then the BOF (82) is asymptotically optimal as $t \to \infty$.

$H_5$: $a < 0$

$H_6$: $\ell(t) \ge r(t)$; $r(t)$ given by (59).


The corollary can be used to treat the more general cases where $a$ and $c$ are time varying, i.e.,

$$g(t,x) = a(t)x + \sum_i \lambda_i(t)f_i(t,x) \tag{89}$$

$$h(t,x) = c(t)x + \sum_i \nu_i(t)k_i(t,x) \tag{90}$$

where $\lim_{t\to\infty}a(t) = a$ and $\lim_{t\to\infty}c(t) = c$.

As an illustration, assume that $a(t)$ and $c(t)$ are monotone and continuous; then (89)-(90) may be rewritten as

$$g(t,x) = ax + (a(t) - a)x + \langle\lambda, f\rangle_n \tag{91}$$

$$h(t,x) = cx + (c(t) - c)x + \langle\nu, k\rangle_m \tag{92}$$

By letting:

$$\lambda_{n+1}(t) = |a(t) - a|, \qquad \nu_{m+1}(t) = |c(t) - c|$$

$$f_{n+1}(t,x) = \mathrm{sign}(a(t) - a)\,x, \qquad k_{m+1}(t,x) = \mathrm{sign}(c(t) - c)\,x$$

Equations (91)-(92) become:

$$g(t,x) = ax + \langle\lambda, f\rangle_{n+1} \tag{93}$$

$$h(t,x) = cx + \langle\nu, k\rangle_{m+1} \tag{94}$$

and we are in position to apply the Corollary, since $\lambda_{n+1}$ and $\nu_{m+1}$ are continuous vanishing nonnegative functions, with $f_{n+1}$ and $k_{m+1}$ belonging to $\prec[\mathrm{sign}(a(t) - a), \delta]$ and $\prec[\mathrm{sign}(c(t) - c), \delta]$ respectively, where $\delta > 0$ is arbitrary.
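The rewriting (91)-(93) is a bookkeeping step that can be sketched in a few lines. The helper names below and the example $a(t)$ are hypothetical; the point is only that appending $\lambda_{n+1}$ and $f_{n+1}$ reproduces the original drift exactly.

```python
import math

def sign(v):
    """Sign of a real number as -1, 0 or 1."""
    return (v > 0) - (v < 0)

def augment(a_of_t, a_limit, lambdas, fs):
    """Append lambda_{n+1}(t) = |a(t) - a| and f_{n+1}(t, x) = sign(a(t) - a) * x."""
    lam_extra = lambda t: abs(a_of_t(t) - a_limit)
    f_extra = lambda t, x: sign(a_of_t(t) - a_limit) * x
    return lambdas + [lam_extra], fs + [f_extra]

def drift(a_limit, lambdas, fs, t, x):
    """g(t, x) = a*x + <lambda(t), f(t, x)> as in (93)."""
    return a_limit * x + sum(lam(t) * f(t, x) for lam, f in zip(lambdas, fs))

# Hypothetical time-varying coefficient a(t) -> a = -1, monotone and continuous.
def a_of_t(t):
    return -1.0 + 0.5 * math.exp(-t)

lambdas = [lambda t: math.exp(-t)]
fs = [lambda t, x: math.tanh(x)]
aug_lambdas, aug_fs = augment(a_of_t, -1.0, lambdas, fs)
```

Evaluating `drift(-1.0, aug_lambdas, aug_fs, t, x)` gives the same value as `a_of_t(t) * x + math.exp(-t) * math.tanh(x)`, so the time variation of $a(t)$ has been absorbed into an extra vanishing nonlinearity.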

1.3.2  Asymptotic  optimality  of  the  KF 

For the nonlinear filtering problem (36), it is clear that (38)-(39) correspond to a regular Kalman filter designed for the underlying linear system obtained when one ignores the nonlinear terms in (36). It should be noted, however, that (38)-(39) is driven by observations from a nonlinear system. We will, nevertheless, continue to refer to it as the "KF", and as the "SSKF" (steady state) when $r(t)$ is replaced by $r(\infty)$.

In addition to the hypotheses above, we make the following assumption:

$H_0$: $f(t,0)$ and $k(t,0)$ are continuous, bounded on $[0, \infty[$.

Proposition 3-4: If $a < 0$ then both the "KF" and the "SSKF" are asymptotically optimal as $t \to \infty$. Moreover:

$$p(\infty) = p^k(\infty) = r(\infty) = \frac{\rho^2}{c^2}\Big[a + \Big(a^2 + \frac{\sigma^2}{\rho^2}c^2\Big)^{1/2}\Big] \tag{95}$$

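Proof: We first derive an upper bound on $p^k(t) := E(x_t - x_t^k)^2$, where $x_t^k$ is given by (38)-(39).

Let $\tilde{x}_t = x_t - x_t^k$; then (36) and (38) yield

$$d\tilde{x}_t = [\tilde{g}_t - G(t)\tilde{h}_t]\,dt + \sigma\,dw_t - \rho G(t)\,dv_t \tag{96}$$

where

$$G(t) = \frac{c}{\rho^2}r(t) \quad \Big(\text{or } \frac{c}{\rho^2}r(\infty)\Big) \tag{97}$$

$$\tilde{g}_t = a\tilde{x}_t + \lambda(t)f(t,x_t) \tag{98}$$

$$\tilde{h}_t = c\tilde{x}_t + \nu(t)k(t,x_t) \tag{99}$$

Applying Ito's chain rule gives

$$d\tilde{x}_t^2 = [\sigma^2 + \rho^2G^2(t)]\,dt + 2\tilde{x}_t\,d\tilde{x}_t \tag{100}$$

Taking the expectation on both sides yields:

$$\frac{d}{dt}E\tilde{x}_t^2 = \dot{p}^k(t) = \sigma^2 + \rho^2G^2(t) + 2E\tilde{x}_t[\tilde{g}_t - G(t)\tilde{h}_t] \tag{101}$$

$$= \sigma^2 + \rho^2G^2 + 2E\tilde{x}_t\tilde{g}_t - 2GE\tilde{x}_t\tilde{h}_t, \quad p^k(0) = \sigma_0^2$$

that is,

$$\dot{p}^k = \sigma^2 + \rho^2G^2 + 2(a - cG)p^k + 2\lambda E\tilde{x}_t f(t,x_t) - 2\nu G E\tilde{x}_t k(t,x_t)$$

Clearly,

$$2E\tilde{x}_t f(t,x_t) \le E\tilde{x}_t^2 + Ef^2(t,x_t) = p^k(t) + Ef^2(t,x_t)$$

$$-2E\tilde{x}_t k(t,x_t) \le p^k(t) + Ek^2(t,x_t)$$

By the comparison theorem: $p^k(t) \le q(t)$; $q(0) = \sigma_0^2$, where

$$\dot{q}(t) = \sigma^2 + \rho^2G^2 + 2(a - cG)q + \lambda(q + Ef^2) + \nu G(q + Ek^2)$$

$$= \sigma^2 + \rho^2G^2 + \lambda Ef^2 + \nu G Ek^2 + [2(a - cG) + \lambda + \nu G]q$$

which we rewrite as

$$\dot{q} = i(t) + j(t)q, \quad q(0) = \sigma_0^2 \tag{102}$$

Now

$$\lim_{t\to\infty}j(t) = 2\Big(a - \frac{c^2}{\rho^2}r(\infty)\Big) = -2\Big(a^2 + \frac{\sigma^2}{\rho^2}c^2\Big)^{1/2} < 0.$$

Thus, if

$$\lim_{t\to\infty}\lambda(t)Ef^2(t,x_t) = \lim_{t\to\infty}\nu(t)Ek^2(t,x_t) = 0 \tag{103}$$

then $\lim_{t\to\infty}i(t) = \sigma^2 + \frac{c^2}{\rho^2}r^2(\infty)$.

Applying Perron's theorem to (102) would give:

$$q(\infty) = -\frac{i(\infty)}{j(\infty)} = \frac{\sigma^2 + \frac{c^2}{\rho^2}r^2(\infty)}{-2\big(a - \frac{c^2}{\rho^2}r(\infty)\big)}$$

But $r(\infty)$ satisfies the algebraic Riccati equation:

$$\sigma^2 + 2ar(\infty) - \frac{c^2}{\rho^2}r^2(\infty) = 0$$

It follows that:

$$q(\infty) = \frac{\sigma^2 + 2ar(\infty) - \frac{c^2}{\rho^2}r^2(\infty) - 2\big(a - \frac{c^2}{\rho^2}r(\infty)\big)r(\infty)}{-2\big(a - \frac{c^2}{\rho^2}r(\infty)\big)} = r(\infty)$$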
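The identity $q(\infty) = r(\infty)$ is a short algebraic consequence of the algebraic Riccati equation, and it can be sanity-checked numerically. The sketch below evaluates $i(\infty)$, $j(\infty)$ and $q(\infty) = -i(\infty)/j(\infty)$ for one hypothetical parameter set; the function name and the numbers are assumptions made for illustration.

```python
import math

def kf_error_bound_limit(a, c, sigma, rho):
    """Return r(inf), i(inf), j(inf) and q(inf) = -i(inf)/j(inf) for the linear ODE (102)."""
    r_inf = (rho**2 / c**2) * (a + math.sqrt(a**2 + (c * sigma / rho) ** 2))
    i_inf = sigma**2 + (c / rho) ** 2 * r_inf**2       # limit of the forcing term
    j_inf = 2.0 * (a - (c / rho) ** 2 * r_inf)         # limit of the coefficient of q
    q_inf = -i_inf / j_inf
    return r_inf, i_inf, j_inf, q_inf

r_inf, i_inf, j_inf, q_inf = kf_error_bound_limit(a=-1.0, c=1.0, sigma=0.2, rho=0.2)
# Residual of the algebraic Riccati equation at r(inf); should vanish up to rounding.
are_residual = 0.2**2 + 2.0 * (-1.0) * r_inf - (1.0 / 0.2) ** 2 * r_inf**2
```

For these values `j_inf` is negative, as Perron's theorem requires, and `q_inf` coincides with `r_inf` up to rounding error.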
If $a < 0$ and $\ell(t) \le r(t)$, then by letting $v_1(t) = r(t)$ and $v_2(t) = \ell(t)$ in Lemma 3-1 we conclude that $\ell(\infty) = r(\infty) = q(\infty)$ and hence $p(\infty) = p^k(\infty) = r(\infty)$.

If $\ell(t)$ is not less than or equal to $r(t)$ for every $t$, then we can always find a lower bound $\ell'(t)$ which is less than or equal to both $\ell(t)$ and $r(t)$ (see next remark). Thus we can apply the same Lemma with $v_2(t) = \ell'(t)$ and conclude that $\ell'(\infty) = r(\infty) = q(\infty)$ $(= \ell(\infty))$ and hence $p(\infty) = p^k(\infty) = r(\infty)$.


We now show that (103) holds if $a < 0$ and $H_0$ hold.

The condition $f \in_{\prec} [\mu(t), \Delta\mu(t)]$ implies

$$\underline{\mu}(t)x + f(t,0) \le f(t,x) \le \bar{\mu}(t)x + f(t,0) \tag{104}$$

where the time functions $\underline{\mu}(t)$, $\bar{\mu}(t)$ and $f(t,0)$ are all bounded and continuous for $t \ge 0$. Equation (104) implies in turn that:

$$f^2(t,x) \le A^2(t)x^2 + B^2(t) \tag{105}$$

for some continuous bounded functions $A$ and $B$. Therefore,

$$\lim_{t\to\infty}\lambda(t)Ef^2(t,x_t) = 0$$

holds if

$$\lim_{t\to\infty}\lambda(t)Ex_t^2 = 0$$

$Ex_t^2$ satisfies the following ODE [16]:

$$\frac{d}{dt}Ex_t^2 = \sigma^2 + 2\lambda(t)Ex_t f(t,x_t) + 2aEx_t^2 \tag{106}$$

and

$$2Ex_t f(t,x_t) \le Ex_t^2 + Ef^2(t,x_t) \tag{107}$$

Combining (105), (106) and (107), we conclude by the comparison theorem that $Ex_t^2$ is bounded by $V(t)$ where:

$$\dot{V} = \sigma^2 + \lambda(t)B^2(t) + (2a + \lambda(t) + \lambda(t)A^2(t))V(t)$$

Perron's theorem applies and $V(\infty) = -\sigma^2/2a$. Hence $Ex_t^2$ is bounded and, since $\lambda(t)$ vanishes,

$$\lim_{t\to\infty}\lambda(t)Ef^2(t,x_t) = 0.$$

QED

Clearly, the same is also true for $\nu(t)Ek^2(t,x_t)$.

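Remark: Let $f \in_{\prec} [\mu(t), \Delta\mu(t)]$ and $k \in_{\prec} [\xi(t), \Delta\xi(t)]$, i.e.,

$$f_x \in [\underline{\mu}(t), \bar{\mu}(t)], \quad k_x \in [\underline{\xi}(t), \bar{\xi}(t)] \tag{108}$$

The lower bound is then given by:

$$\dot{\ell}(t) = \sigma^2 + 2\underline{\alpha}(t)\ell(t) - \frac{1}{\rho^2}\Big[\bar{\beta}^2 + 4\frac{\rho^2}{\sigma^2}(\Delta\alpha)^2\Big]\ell^2(t); \quad \ell(0) = \sigma_0^2$$

where $\underline{\alpha}(t) = a + \lambda(t)\underline{\mu}(t)$, $\Delta\alpha(t) = \lambda(t)\Delta\mu(t)$ and $\bar{\beta}(t) = c + \nu(t)\bar{\xi}(t)$. Clearly, $\ell(t) \le r(t)$ if $\underline{\mu}(t) \le 0$ and $\bar{\xi}(t) \ge 0$, where

$$\dot{r}(t) = \sigma^2 + 2ar(t) - \frac{c^2}{\rho^2}r^2(t); \quad r(0) = \sigma_0^2$$

If $\underline{\mu}(t) \le 0$ and $\bar{\xi}(t) \ge 0$ does not hold, then we can always choose a worse lower bound $\ell'(t)$ such that $\ell'(t) \le r(t)$. This is possible since (108) implies that $f_x \in [\underline{\mu}'(t), \bar{\mu}(t)]$; $k_x \in [\underline{\xi}, \bar{\xi}'(t)]$ with $\underline{\mu}'(t) \le 0$ and $\bar{\xi}'(t) \ge 0$.

Let us now turn to the case where $g$ and $h$ are again given by (89)-(90). Then under the same assumptions and notations of Corollary 3-3 and the obvious additional assumptions introduced by $H_0$ (namely that it holds for each $f_i$, $k_j$), it can be shown [20] that the following holds:

Corollary 3-5: If $a < 0$, then both the "KF" and the "SSKF" are asymptotically optimal as $t \to \infty$. Moreover,

$$p(\infty) = p^k(\infty) = r(\infty) = \frac{\rho^2}{c^2}\Big[a + \Big(a^2 + \frac{\sigma^2}{\rho^2}c^2\Big)^{1/2}\Big] \tag{109}$$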

1.4  An  Example 

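Example:

Figure 1: BOF performance

Let $x_t$ and $y_t$ be given by:

$$dx_t = a\,x_t\,dt + e^{-t}\sin^2(\omega t)\tanh(x_t)\,dt + \sigma\,dw_t$$
$$dy_t = c\,x_t\,dt + e^{-t}\,x_t e^{-x_t^2}\,dt + \rho\,dv_t$$
$$x_0 \sim N(m_0, \sigma_0^2)$$

Thus,

$$\lambda(t) = e^{-t}\sin^2(\omega t); \quad f(x) = \tanh(x); \quad \nu(t) = e^{-t}; \quad k(x) = x e^{-x^2} \tag{111}$$

Simulations were done with the following numerical data:

$$a = -1,\ \omega = 50,\ \sigma = \rho = 0.2,\ c = 1,\ m_0 = 0.0,\ \sigma_0^2 = 1$$

for which it is readily obtained that $f \in\prec [\tfrac12, \tfrac12]$ and $k \in\prec \left[\tfrac{1 - 2e^{-3/2}}{2}, \tfrac{1 + 2e^{-3/2}}{2}\right]$, i.e.,

$$\underline{\mu} = 0,\quad \bar\mu = 1,\quad \mu = \tfrac12,\quad \Delta\mu = \tfrac12 \tag{112}$$

$$\underline{\xi} = -2e^{-3/2},\quad \bar\xi = 1,\quad \xi = \frac{1 - 2e^{-3/2}}{2},\quad \Delta\xi = \frac{1 + 2e^{-3/2}}{2}$$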


The simulation results, obtained using Monte Carlo methods, are summarized in the plots of Figures 1 and 2, corresponding to the BOF and “KF” respectively. In Figure 1, the upper and lower bounds $(u(t), \ell(t))$ on the optimal MS-error $p(t) := E[x_t - E(x_t \mid y_0^t)]^2$ are plotted together with the MSE corresponding to the BOF. A similar plot is given in Figure 2 for the “KF”, except that $r(t)$ is plotted instead of $u(t)$. (Recall that while $u$ was shown to be an upper bound on the BOF MSE $p^*(t)$, neither $u$ nor $r$ is known to be an upper bound for the “KF” MSE $p^k(t)$.)

It can be seen that the BOF and the “KF” are indeed both asymptotically optimal in the sense that $\lim_{t\to\infty}(p(t) - p^*(t)) = 0$ and $\lim_{t\to\infty}(p(t) - p^k(t)) = 0$ respectively. Moreover:

$$p(\infty) = p^*(\infty) = p^k(\infty) = r(\infty) = \frac{\rho^2}{c^2}\left[a + \left(a^2 + \frac{\sigma^2}{\rho^2}c^2\right)^{1/2}\right] \approx 0.017 \tag{113}$$
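The limiting value quoted in (113) can be checked numerically. The following is a minimal sketch, assuming the example's data $a = -1$, $c = 1$, $\sigma = \rho = 0.2$; the function name `steady_state_variance` is introduced here only for illustration. It evaluates the closed-form root and confirms that it satisfies the algebraic Riccati equation.

```python
import math

def steady_state_variance(a, c, sigma, rho):
    """Positive root of sigma^2 + 2*a*r - (c^2/rho^2)*r^2 = 0."""
    k = rho**2 / c**2
    return k * (a + math.sqrt(a**2 + (sigma**2 / rho**2) * c**2))

# Numerical data assumed from the example (a = -1, c = 1, sigma = rho = 0.2).
a, c, sigma, rho = -1.0, 1.0, 0.2, 0.2
r_inf = steady_state_variance(a, c, sigma, rho)

# Residual of the algebraic Riccati equation at r_inf (should be ~0).
residual = sigma**2 + 2 * a * r_inf - (c**2 / rho**2) * r_inf**2
print(f"r(inf) = {r_inf:.4f}, Riccati residual = {residual:.2e}")
```

Rounded to three decimals, the computed value agrees with the figure 0.017 reported above.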


1.5  Conclusions 


We investigated the asymptotic behavior of one dimensional nonlinear filtering problems involving drifts with bounded derivatives, using an upper and lower bound approach to show that the a priori mean square error associated with some suboptimal filters approaches the optimal one asymptotically. The upper and lower bounds satisfy ordinary differential equations of the Riccati type. In particular, it is shown that in the case of asymptotically time invariant systems for which the limiting system is linear, the “KF” and “SSKF” (designed for the limiting linear system) are asymptotically optimal as $t \to \infty$ (section 1.3). In other words, the nonlinearity can be ignored as far as the long time behavior is concerned. This approach showed that significant information relevant to this type of filtering problem can be inferred from knowledge of the derivative bounds (i.e., of the cone in which the nonlinearities reside). The main point is that tractable bounds on the optimal MS-error, when available, can be used (in addition to performance testing of suboptimal designs) as a tool to tackle some questions arising in nonlinear filtering.

2 A Bound Approach to Filters for Weakly Nonlinear Systems

2.1  Introduction 

In this section we again consider the Ito stochastic model:

$$dx_t = g(t,x_t)\,dt + \sigma(t)\,dw_t \tag{114}$$
$$dy_t = h(t,x_t)\,dt + \rho(t)\,dv_t \tag{115}$$
$$x(0) = x_0, \qquad 0 \le t \le T \tag{116}$$

where $g$, $h$, $\sigma$ and $\rho$ are smooth functions of their arguments, $\{v_t\}$, $\{w_t\}$ are independent Wiener processes, and $x_0$ is a random variable independent of $\{w_t\}$ and $\{v_t\}$.

As we have shown in the previous section, the performance of suboptimal designs, however derived, may be assessed using lower and upper bounds on the optimal mean square error (MS-error) $p(t)$ [22]. This approach is used here to investigate the long time (asymptotic) behavior of a class of nonlinear filtering problems, namely weakly nonlinear systems [10] and systems with low measurement noise level [19,6]. Systems of the first type are modeled as:

$$dx_t = a(t)x_t\,dt + \epsilon f(t,x_t)\,dt + \sigma(t)\,dw_t \tag{117}$$
$$dy_t = c(t)x_t\,dt + \rho(t)\,dv_t$$

while those of the second type are:

$$dx_t = g(t,x_t)\,dt + \sigma(t)\,dw_t \tag{118}$$
$$dy_t = h(t,x_t)\,dt + \epsilon\,dv_t$$

It is well known that for filtering problems of this type there may be no finite set of equations which propagate the conditional mean.

We are interested in (one dimensional) suboptimal filters which are asymptotically optimal in the sense that the corresponding a priori mean square error (MSE) is identical, up to some power of $\epsilon$, to the optimal one.

Weakly nonlinear systems have been studied in [6,5,4]. In [6], Brockett showed that in the general case, even to be optimal in the asymptotic sense, such filters must evolve in higher dimensional spaces than $x_t$ does.

One question of particular interest is the effect of the weak nonlinearity on the filtering performance. In other words, the question is whether the Kalman filter (“KF”), formally designed for the underlying linear system and driven by the observations $\{y_t\}$ in (117), is asymptotically optimal for small $\epsilon$ (notice that these are observations from a nonlinear system).

In section 2.3, it is shown that for a particular class of nonlinearities $f$ (those with bounded derivatives), the “KF” and the so-called bound optimal filter (BOF, section 2.2), both of which are one dimensional filters with precomputable (nonrandom) gains, are asymptotically optimal as $\epsilon \to 0$.

Next, the low measurement noise case, first studied in [19,6], is treated in section 2.4, where the BOF and a constant gain version of it are shown to be asymptotically optimal. In addition, an even simpler asymptotically optimal filter, which is linear and does not involve the drift, is obtained. Some of these results have been obtained in [19,6] by a different approach (e.g., a WKB procedure applied directly to the DMZ equation in Fisk-Stratonovich form [19]), while here basic bounds on the a priori optimal MS-error and perturbation methods are used. Examples with simulation results are provided in section 2.5.


2.2  Lower  and  Upper  Bounds  on  the  Optimal  MS-Error 

Let us consider the one dimensional version of (114) where $x_0$ is assumed to be $N(0, \sigma_0^2)$; $g$ and $h$ are such that (114) has a unique solution [1], and are differentiable with continuous partial derivatives satisfying the following hypotheses:

Hh1: $|g_x(t,x) - a(t)| \le \Delta a(t)$

Hh2: $|h_x(t,x) - \beta(t)| \le \Delta\beta(t)$, $\quad \underline{\beta}(t) := \beta(t) - \Delta\beta(t) > 0$

which we denote by

$$g \in\prec [a(t), \Delta a(t)], \qquad h \in\prec [\beta(t), \Delta\beta(t)] \tag{119}$$

Define

$$\bar a(t) := a(t) + \Delta a(t), \qquad \underline{a}(t) := a(t) - \Delta a(t) \tag{120}$$
$$\bar\beta(t) := \beta(t) + \Delta\beta(t), \qquad \underline{\beta}(t) := \beta(t) - \Delta\beta(t) \tag{121}$$
$$p(t) := E\left(x_t - E(x_t \mid y_0^t)\right)^2 \tag{122}$$
$$p^*(t) := E(x_t - x_t^*)^2 \tag{123}$$

where $x_t^*$ is the BOF and is given by

$$dx_t^* = g(t,x_t^*)\,dt + \frac{\underline{\beta}(t)}{\rho^2(t)}u(t)\left[dy_t - h(t,x_t^*)\,dt\right], \qquad x_0^* = 0 \tag{124}$$
$$\dot u(t) = \sigma^2(t) + 2\bar a(t)u(t) - \frac{\underline{\beta}^2(t)}{\rho^2(t)}u^2(t), \qquad u(0) = \sigma_0^2 \tag{125}$$

The stochastic process satisfying the above nonlinear SDE is called the bound optimal filter (BOF). Clearly, the BOF is readily implementable with precomputable (nonrandom) gain and it coincides with the Kalman filter when $g$ and $h$ are linear. Moreover, the BOF is “bound optimal” in the sense that, among all nonlinear filters given by (124) but with arbitrary nonrandom, continuous gains $k(t)$, the choice $k^*(t) := \frac{\underline{\beta}(t)}{\rho^2(t)}u(t)$ yields a nonlinear filter (the BOF) that has the tightest upper bound on the corresponding MS-error. Furthermore, this upper bound is precisely $u(t)$ (see [22,20,13]).

The following result, proved in [21,20], provides explicit lower and upper bounds on the (unknown) optimal MS-error $p(t)$.

Theorem 2-1: Let $p(t)$, $p^*(t)$ and $u(t)$ be as in (122), (123) and (125) respectively. Then:

$$0 \le \ell(t) \le p(t) \le p^*(t) \le u(t)$$

where

$$\dot\ell(t) = \sigma^2(t) + 2\underline{a}(t)\ell(t) - \frac{1}{\rho^2(t)}\left[\bar\beta^2(t) + 4\frac{\rho^2(t)}{\sigma^2(t)}(\Delta a(t))^2\right]\ell^2(t) \tag{126}$$
$$\ell(0) = \sigma_0^2$$

Remark: Since $\ell(t)$ and $u(t)$ both satisfy ODEs of the Riccati type, the Theorem says that the optimal MS-error $p(t)$ in the nonlinear filtering problem is bounded by those of two corresponding Kalman filtering problems, the coefficients of which are obvious from (125) and (126).
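Because both bounds are scalar Riccati equations, they can be precomputed with any standard ODE integrator. The sketch below is illustrative only: the constant cone data (`a`, `da`, `b`, `db`), the noise levels and the helper `rk4` are assumptions chosen for the example, not values taken from the report. It integrates the upper-bound and lower-bound equations and checks that the lower bound stays below the upper bound.

```python
def rk4(f, y0, t_end, n_steps):
    """Classical fourth-order Runge-Kutta for a scalar ODE y' = f(t, y)."""
    h = t_end / n_steps
    t, y, path = 0.0, y0, [y0]
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
        path.append(y)
    return path

# Hypothetical constant cone data: g_x in [a - da, a + da], h_x in [b - db, b + db].
a, da = -1.0, 0.5
b, db = 1.0, 0.2
sigma, rho, var0 = 0.3, 0.3, 0.1

def upper_rhs(t, u):
    # Upper-bound Riccati equation: uses a + da and b - db.
    return sigma**2 + 2 * (a + da) * u - ((b - db)**2 / rho**2) * u**2

def lower_rhs(t, l):
    # Lower-bound Riccati equation: uses a - da, b + db and the extra da^2 term.
    quad = ((b + db)**2 + 4 * (rho**2 / sigma**2) * da**2) / rho**2
    return sigma**2 + 2 * (a - da) * l - quad * l**2

u_path = rk4(upper_rhs, var0, 5.0, 5000)
l_path = rk4(lower_rhs, var0, 5.0, 5000)
print(f"l(5) = {l_path[-1]:.4f} <= p(5) <= u(5) = {u_path[-1]:.4f}")
```

The gap between the two printed values indicates how tightly the unknown optimal MS-error is pinned down for the chosen cone widths; it closes as `da` and `db` shrink to zero.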

Definition: Let $\{x_t^s\}$ be any suboptimal filter, $p^s(t,\epsilon) := E(x_t - x_t^s)^2$ and $p(t,\epsilon) := E[x_t - E(x_t \mid y_0^t)]^2$. Then $\{x_t^s\}$ is said to be asymptotically optimal if $p(t,\epsilon)$ and $p^s(t,\epsilon)$ agree up to some power ($k \ge 1$) of $\epsilon$ in a nontrivial way.

The proof of asymptotic optimality for a given suboptimal filter $\{x_t^s\}$ uses the argument that if one can bound $p(t,\epsilon)$ and $p^s(t,\epsilon)$ as in

$$0 \le \ell^s(t,\epsilon) \le p(t,\epsilon) \le p^s(t,\epsilon) \le u^s(t,\epsilon)$$

for some tractable bounds $\ell^s$ and $u^s$, then it suffices to show that the first terms in the corresponding asymptotic expansions are identical.


2.3 Weakly Nonlinear Systems

Let $x_t$ and $y_t$ be given by

$$dx_t = g(t,x_t)\,dt + \epsilon f(t,x_t)\,dt + \sigma(t)\,dw_t, \qquad 0 \le t \le T \tag{127}$$
$$dy_t = h(t,x_t)\,dt + \rho(t)\,dv_t$$

where $x_0$ is $N(0, \sigma_0^2)$; $\{w_t\}$, $\{v_t\}$ are Brownian motions independent of $x_0$; $f$, $g$, and $h$ have enough smoothness to guarantee the well posedness of (114).

When $\epsilon > 0$ is a small parameter and $g$ and $h$ are linear, we call these weakly nonlinear (WNL) systems. WNL systems were studied in [10], where it was shown that if, e.g., $f(t,x) = x^3$, then there does not exist a reduced order (i.e., one dimensional) filter which has the optimal asymptotic performance.

Our goal here is to exhibit one dimensional filters that are always asymptotically optimal for a restricted class of nonlinearities $f$, namely those with bounded derivatives.

In the next two subsections upper and lower bounds on $p(t) := E(x_t - E(x_t \mid y_0^t))^2$, $p^*(t) := E(x_t - x_t^*)^2$ and $p^k(t) := E(x_t - x_t^k)^2$ ($x_t^*$, $x_t^k$ being the BOF and “KF” estimators respectively) are used to establish that in the weakly nonlinear case, that is, when $g$ and $h$ are linear, both filters are asymptotically optimal in the sense that $p$, $p^*$ and $p^k$ are the same up to first order in $\epsilon$.

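2.3.1 Asymptotic optimality of the BOF

Let $x_t$ and $y_t$ be given by (127) and assume that:

$$g \in\prec [a(t), \Delta a(t)], \qquad f \in\prec [\mu(t), \Delta\mu(t)]$$
$$h \in\prec [c(t), \Delta c(t)], \qquad \underline{c}(t) := c(t) - \Delta c(t) > 0,\ t \ge 0$$

We recall that here the BOF $x_t^*$ is given by:

$$dx_t^* = g(t,x_t^*)\,dt + \epsilon f(t,x_t^*)\,dt + \frac{\underline{c}(t)}{\rho^2(t)}u(t)\left[dy_t - h(t,x_t^*)\,dt\right] \tag{128}$$
$$x^*(0) = 0$$
$$\dot u = \sigma^2(t) + 2\left(\bar a(t) + \epsilon\bar\mu(t)\right)u(t) - \frac{\underline{c}^2(t)}{\rho^2(t)}u^2, \qquad u(0) = \sigma_0^2$$

Proposition 3-1: If $\Delta a(t) = \Delta c(t) = 0$ and $c(t) > 0$, then the BOF is asymptotically optimal as $\epsilon \to 0$, i.e.,

$$p^*(t) \sim p(t) = r(t) + O(\epsilon), \qquad 0 \le t \le T \tag{129}$$

where

$$\dot r = \sigma^2(t) + 2a(t)r(t) - \frac{c^2(t)}{\rho^2(t)}r^2, \qquad r(0) = \sigma_0^2 \tag{130}$$

Remark: If, furthermore, the system is time invariant then

$$p^*(t) \sim p(t) = r(t) + 2\epsilon\mu\int_0^t \phi(t,s)\,r(s)\,ds + O(\epsilon^2, \Delta\mu) \tag{131}$$

where

$$r(t) = \frac{\rho^2}{c^2}\left\{a + b\,\frac{1 - Ae^{-2bt}}{1 + Ae^{-2bt}}\right\}$$
$$b = \sqrt{a^2 + \frac{\sigma^2}{\rho^2}c^2}$$
$$A = \frac{\frac{\rho^2}{c^2}(a + b) - \sigma_0^2}{\sigma_0^2 - \frac{\rho^2}{c^2}(a - b)}$$
$$\phi(t,s) = e^{2a(t-s)}\exp\left\{-2\frac{c^2}{\rho^2}\int_s^t r(\tau)\,d\tau\right\} \tag{134}$$

Here $O(x,y)$ means order of each one of the arguments separately.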
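The closed-form expression for $r(t)$ in the time invariant case can be sanity-checked against a direct numerical integration of the Riccati equation (130). The following is a sketch under assumed illustrative constants (they are not the report's simulation data), with `r_closed` and `r_numeric` as hypothetical helper names.

```python
import math

# Illustrative time-invariant data (assumed, not taken from the report).
a, c, sigma, rho, var0 = -1.0, 1.0, 0.3, 0.3, 0.1

k = rho**2 / c**2
b = math.sqrt(a**2 + (sigma**2 / rho**2) * c**2)
A = (k * (a + b) - var0) / (var0 - k * (a - b))

def r_closed(t):
    """Closed-form solution of r' = sigma^2 + 2 a r - (c^2/rho^2) r^2, r(0) = var0."""
    e = A * math.exp(-2 * b * t)
    return k * (a + b * (1 - e) / (1 + e))

def riccati_rhs(r):
    return sigma**2 + 2 * a * r - r**2 / k

def r_numeric(t_end, n_steps=4000):
    """RK4 integration of the same Riccati equation."""
    h, r = t_end / n_steps, var0
    for _ in range(n_steps):
        k1 = riccati_rhs(r)
        k2 = riccati_rhs(r + h * k1 / 2)
        k3 = riccati_rhs(r + h * k2 / 2)
        k4 = riccati_rhs(r + h * k3)
        r += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return r

for t in (0.5, 2.0):
    print(f"t={t}: closed form {r_closed(t):.8f}, RK4 {r_numeric(t):.8f}")
```

As $t$ grows, both values approach the stationary root $\frac{\rho^2}{c^2}(a + b)$.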

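Proof: It readily follows from the above assumptions that $(g + \epsilon f) \in\prec [a(t) + \epsilon\mu(t), \Delta a(t) + \epsilon\Delta\mu(t)]$.

From Theorem 2-1 we get

$$0 \le \ell(t) \le p(t) \le p^*(t) \le u(t) \tag{135}$$

where:

$$\dot u = \sigma^2(t) + 2\left(\bar a(t) + \epsilon\bar\mu(t)\right)u - \frac{\underline{c}^2(t)}{\rho^2(t)}u^2, \qquad u(0) = \sigma_0^2 \tag{136}$$
$$\dot\ell = \sigma^2(t) + 2\left(\underline{a}(t) + \epsilon\underline{\mu}(t)\right)\ell - \frac{1}{\rho^2(t)}\left[\bar c^2(t) + 4\frac{\rho^2(t)}{\sigma^2(t)}\left(\Delta a(t) + \epsilon\Delta\mu(t)\right)^2\right]\ell^2, \qquad \ell(0) = \sigma_0^2 \tag{137}$$

Expanding $u(t)$ in the form:

$$u(t) \sim \sum_{n=0}^{\infty} u_n(t)\,\epsilon^n \tag{138}$$

gives:

$$u^2(t) \sim \sum_{n=0}^{\infty} c_n\,\epsilon^n, \qquad c_n = \sum_{i=0}^{n} u_i(t)\,u_{n-i}(t) \tag{139}$$

Plugging (138) and (139) in (136) and equating powers of $\epsilon$ yields:

$$\dot u_0 = \sigma^2(t) + 2\bar a(t)u_0 - \frac{\underline{c}^2(t)}{\rho^2(t)}u_0^2, \qquad u_0(0) = \sigma_0^2 \tag{140}$$
$$\dot u_1 = 2\left[\bar a(t) - \frac{\underline{c}^2(t)}{\rho^2(t)}u_0\right]u_1 + 2\bar\mu(t)u_0(t), \qquad u_1(0) = 0 \tag{141}$$

Proceeding similarly for $\ell(t)$, one obtains:

$$\dot\ell_0 = \sigma^2(t) + 2\underline{a}(t)\ell_0 - \frac{1}{\rho^2(t)}\left[\bar c^2(t) + 4\frac{\rho^2(t)}{\sigma^2(t)}\delta a^2(t)\right]\ell_0^2, \qquad \ell_0(0) = \sigma_0^2 \tag{142}$$
$$\dot\ell_1 = 2\left[\underline{a}(t) - \frac{1}{\rho^2(t)}\left(\bar c^2(t) + 4\frac{\rho^2(t)}{\sigma^2(t)}\delta a^2(t)\right)\ell_0\right]\ell_1 + 2\underline{\mu}(t)\ell_0 - \frac{8}{\sigma^2(t)}\delta a(t)\,\delta\mu(t)\,\ell_0^2, \qquad \ell_1(0) = 0 \tag{143}$$

(here $\delta a^2 := (\delta a)^2$).

It is clear from (140) and (142) that $u_0(t)$ and $\ell_0(t)$ are different in the general case but coincide with $r(t)$ if $\delta a = \delta c = 0$, that is:

$$g(t,x) = a(t)x \quad\text{and}\quad h(t,x) = c(t)x$$

Now if the system is time invariant, i.e.,

$$a(t) = a,\quad \mu(t) = \mu,\quad c(t) = c,\quad \sigma(t) = \sigma \quad\text{and}\quad \rho(t) = \rho$$

then one easily gets the results in the remark above by using the Riccati transformation $r = \frac{\rho^2}{c^2}\frac{\dot w}{w}$ to solve (130) and the variation of constants formula in (141) and (143).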

2.3.2 Asymptotic optimality of the KF

The question considered here is whether one could, in the case of weakly nonlinear systems, ignore the nonlinear part in the drift, use the Kalman filter designed for the underlying linear system (driven by $y_t$) and still achieve asymptotic optimality as $\epsilon \to 0$. It is important, however, to notice that even though this scheme is referred to as the “KF”, it has little to do with the regular Kalman filter, the reason being that the “KF” is driven by observations from a nonlinear system.

Accordingly, let $g(t,x) = a(t)x$, $h(t,x) = c(t)x$ and assume that $f \in\prec [\mu(t), \delta\mu(t)]$, $c(t) > 0$; then the “KF” is given by:

$$dx_t^k = a(t)x_t^k\,dt + \frac{c(t)}{\rho^2(t)}r(t)\left[dy_t - c(t)x_t^k\,dt\right], \qquad x^k(0) = 0 \tag{144}$$

where $r(t)$ is as in (130).

Proposition 3-2: Under the above assumption, the “KF” is asymptotically optimal as $\epsilon \to 0$ in the sense that:

$$p^k(t) \sim p(t) = r(t) + O(\epsilon), \qquad 0 \le t \le T$$

Proof: We first derive an upper bound on $p^k(t) := E(x_t - x_t^k)^2$ where $x_t^k$ is given by (144).

Let $\tilde x_t := x_t - x_t^k$; then

$$d\tilde x_t = \left[\tilde g_t - c(t)G(t)\tilde x_t\right]dt + \sigma(t)\,dw_t - \rho(t)G(t)\,dv_t \tag{145}$$

where $G(t) := \frac{c(t)}{\rho^2(t)}r(t)$ and $\tilde g_t = a(t)\tilde x_t + \epsilon f(t,x_t)$. Applying Ito's chain rule [16] gives

$$d\tilde x_t^2 = \left[\sigma^2 + \rho^2G^2\right]dt + 2\tilde x_t\,d\tilde x_t \tag{146}$$

Taking expectations on both sides yields:

$$\frac{d}{dt}E\tilde x_t^2 = \dot p^k(t) = \sigma^2 + \rho^2G^2 + 2E\tilde x_t\left[\tilde g_t - cG\tilde x_t\right]$$
$$\dot p^k(t) = \sigma^2 + \rho^2G^2 + 2E\tilde x_t\tilde g_t - 2cG\,E\tilde x_t^2, \qquad p^k(0) = \sigma_0^2$$
$$\dot p^k = \sigma^2 + \rho^2G^2 + 2(a - cG)p^k + 2\epsilon E\tilde x_t f(t,x_t) \tag{147}$$

Clearly

$$2E\tilde x_t f(t,x_t) \le E\tilde x_t^2 + Ef^2(t,x_t) = p^k(t) + Ef^2(t,x_t) \tag{148}$$

By the comparison theorem (see appendix): $p^k(t) \le q(t)$, $q(0) = \sigma_0^2$, where

$$\dot q(t) = \sigma^2 + \rho^2G^2 + 2(a - cG)q + \epsilon(q + Ef^2) = \sigma^2 + \rho^2G^2 + \epsilon Ef^2 + \left[2(a - cG) + \epsilon\right]q$$

which we rewrite as

$$\dot q = i(t) + j(t)q, \qquad q(0) = \sigma_0^2$$
$$i(t) = \sigma^2 + \frac{c^2}{\rho^2}r^2 + \epsilon Ef^2(t,x_t)$$
$$j(t) = \epsilon + 2\left[a - \frac{c^2}{\rho^2}r(t)\right]$$

We therefore have the following bounds:

$$\ell(t) \le p(t) \le p^k(t) \le q(t)$$

where:

$$\dot\ell = \sigma^2 + 2(a + \epsilon\underline{\mu})\ell - \frac{1}{\rho^2}\left[c^2 + 4\frac{\rho^2}{\sigma^2}\delta\mu^2\epsilon^2\right]\ell^2, \qquad \ell(0) = \sigma_0^2$$

Expanding $q(t)$ in the form:

$$q(t) \sim \sum_{n=0}^{\infty} q_n(t)\,\epsilon^n$$

and equating powers of $\epsilon$ yields:

$$\dot q_0 = \sigma^2(t) + \frac{c^2(t)}{\rho^2(t)}r^2(t) + 2\left[a(t) - \frac{c^2(t)}{\rho^2(t)}r(t)\right]q_0, \qquad q_0(0) = \sigma_0^2$$

Let $w := q_0(t) - \ell_0(t)$. Then from the previous section it follows, by making $\delta a = 0$ in (142), that $w(t) = q_0(t) - r(t)$. By differentiating we get

$$\dot w(t) = \frac{c^2}{\rho^2}r^2(t) + 2\left[a(t) - \frac{c^2}{\rho^2}r(t)\right]q_0 - 2a(t)r(t) + \frac{c^2}{\rho^2}r^2(t)$$

This in turn easily becomes:

$$\dot w = 2\left[a(t) - \frac{c^2}{\rho^2}r(t)\right]w, \qquad w(0) = 0$$

the solution of which clearly is $w(t) = 0$, which implies $q_0 = r$.


2.4 Low Measurement Noise Level

Consider the system:

$$dx_t = g(t,x_t)\,dt + \sigma(t)\,dw_t \tag{153}$$
$$dy_t = h(t,x_t)\,dt + \epsilon\,dv_t$$

where $g \in\prec [a(t), \delta a(t)]$, $h \in\prec [c(t), \delta c(t)]$, $\underline{c}(t) > 0$, $t \ge 0$ and $\epsilon > 0$ is a small parameter (this is the case in many practical situations [9,6]).

The optimal a priori MS-error is bounded from above and below; perturbation methods for the bounds are then used to show that the upper bound approaches the lower one as $\epsilon$ becomes smaller.

The result is quoted for $h$ linear but holds for nonlinearities $h$ which tend asymptotically to be linear, i.e., $\delta c$ is small (see remark 2). This type of (almost linear) nonlinearity arises in practice and is usually modeled as being linear [13].

Proposition 4-1: Assume that $\delta c = 0$ (i.e. $h$ is linear) and $c(t) > 0$; then the optimal MS-error $p(t)$ satisfies the following:

$$p(t) = \frac{\sigma(t)}{c(t)}\epsilon + l(\epsilon) = E(x_t - x_t^f)^2 \tag{154}$$

where $\lim_{\epsilon\to 0} \frac{l(\epsilon)}{\epsilon} = 0$ and $x_t^f$ denotes any one of the three asymptotically optimal filters listed below.


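(F1) The BOF:

$$dx_t^* = g(t,x_t^*)\,dt + \frac{c(t)}{\epsilon^2}u(t)\left[dy_t - c(t)x_t^*\,dt\right], \qquad x^*(0) = 0 \tag{155}$$
$$\dot u(t) = \sigma^2(t) + 2\bar a(t)u(t) - \frac{c^2(t)}{\epsilon^2}u^2(t), \qquad u(0) = \sigma_0^2 \tag{156}$$

(F2) The constant gain BOF (CGBOF):

$$dx_t^c = g(t,x_t^c)\,dt + \frac{\sigma(t)}{\epsilon}\left[dy_t - c(t)x_t^c\,dt\right], \qquad x^c(0) = 0 \tag{157}$$

(F3) The linear (first approximation) BOF:

$$dx_t^l = \frac{\sigma(t)}{\epsilon}\left[dy_t - c(t)x_t^l\,dt\right], \qquad x^l(0) = 0 \tag{158}$$

Equation (154) is proven for each case separately.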

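Proof of (F1): From Theorem 2-1 we get:

$$\ell(t) \le p(t) \le p^*(t) = E(x_t - x_t^*)^2 \le u(t) \tag{159}$$
$$\dot u = \sigma^2(t) + 2\bar a(t)u - \frac{\underline{c}^2(t)}{\epsilon^2}u^2, \qquad u(0) = \sigma_0^2 \tag{160}$$
$$\dot\ell = \sigma^2(t) + 2\underline{a}(t)\ell - \frac{1}{\epsilon^2}\left[\bar c^2(t) + 4\frac{\epsilon^2}{\sigma^2(t)}\delta a^2(t)\right]\ell^2, \qquad \ell(0) = \sigma_0^2 \tag{161}$$

It can be easily seen by inspection of (160) and (161) that $u(t)$ and $\ell(t)$ are of different order in $\epsilon$ if $\delta c$ is nonzero. Let us show this explicitly.

Expanding $u(t)$ as

$$u(t) \sim \sum_{n=0}^{\infty} u_n(t)\,\epsilon^n \tag{162}$$

yields

$$u^2(t) \sim \sum_{n=0}^{\infty} d_n(t)\,\epsilon^n \tag{163}$$
$$d_0(t) = u_0^2(t), \qquad d_1(t) = 2u_0(t)u_1(t), \qquad d_2(t) = 2u_0u_2 + u_1^2$$

Plugging (162) and (163) in (160) gives:

$$\sum_{n=0}^{\infty}\dot u_n\epsilon^n = \sigma^2(t) + 2\bar a\sum_{n=0}^{\infty}u_n\epsilon^n - \frac{\underline{c}^2}{\epsilon^2}\sum_{n=0}^{\infty}d_n\epsilon^n \tag{164}$$

Equating powers of $\epsilon$, starting with $\epsilon^{-2}$, yields $d_0 = 0$, i.e., $u_0(t) = 0$. This in turn implies that $d_1 = 0$.

Similarly $\sigma^2 - \underline{c}^2d_2 = 0$. But since $d_2 = u_1^2$, it follows that $u_1(t) = \frac{\sigma(t)}{\underline{c}(t)}$, i.e.,

$$u(t) = \frac{\sigma(t)}{\underline{c}(t)}\epsilon + O(\epsilon^2) \quad\text{for every } 0 < t \le T \tag{165}$$

By a similar procedure we get $\ell_0 = 0$ and $\ell_1 = \frac{\sigma(t)}{\bar c(t)}$, that is

$$\ell(t) = \frac{\sigma(t)}{\bar c(t)}\epsilon + O(\epsilon^2), \qquad 0 < t \le T \tag{166}$$

We conclude from (165) and (166) that if $\delta c = 0$, i.e., $h(t,x) = c(t)x$, then:

$$u(t) \sim \ell(t) = \frac{\sigma(t)}{c(t)}\epsilon + O(\epsilon^2), \qquad 0 < t \le T \tag{167}$$

which establishes the asymptotic optimality of the BOF as $\epsilon \to 0$.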

Note: These approximations are obviously not valid in the immediate vicinity of $t = 0$, where $u(0) = \ell(0) = \sigma_0^2$. This (boundary layer) problem is negligible. It can indeed be easily shown that the duration of the transient regime for this type of ODE is $O(\epsilon)$ (also see Figure 2).

This suggests the following:

(i) Since $u(t) = \epsilon u_1(t) + O(\epsilon^2)$, one can replace $u(t)$ in (155) by $\epsilon u_1 = \epsilon\frac{\sigma(t)}{c(t)}$ and hope to achieve asymptotic optimality as well. The new filter clearly would have the advantage that the gain is $k(t) = \frac{\sigma(t)}{\epsilon}$, thus avoiding the solution of a Riccati equation and therefore resulting in faster computations.

(ii) If the answer to (i) is affirmative, the next question is whether the same thing would hold for the first approximation (when expanding $x_t^*$) filter:

$$dx_t^l = \frac{\sigma(t)}{\epsilon}\left[dy_t - c(t)x_t^l\,dt\right]$$

It turns out that both filters are asymptotically optimal, as is shown next.

Proof of (F2): An upper bound on the MS-error corresponding to filters such as (F2) can be obtained by following the first steps in the proof of Proposition 3-2 (also Section 2-2 in [21]). In this case

$$E(x_t - x_t^c)^2 \le u^c(t)$$

where ($k(t) = \frac{\sigma(t)}{\epsilon}$):

$$\dot u^c = 2\sigma^2(t) + 2\left[\bar a(t) - \frac{c(t)\sigma(t)}{\epsilon}\right]u^c, \qquad u^c(0) = \sigma_0^2 \tag{169}$$

By setting $u^c(t) \sim \sum_i u_i^c(t)\epsilon^i$ in (169), one easily obtains

$$u_0^c(t) = 0, \qquad u_1^c(t) = \frac{\sigma(t)}{c(t)}$$

hence:

$$p(t) = E(x_t - x_t^c)^2 = \frac{\sigma(t)}{c(t)}\epsilon + O(\epsilon^2), \qquad 0 < t \le T$$

(Recall that $p(t) \ge \ell(t) = \frac{\sigma(t)}{c(t)}\epsilon + O(\epsilon^2)$.)

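Proof of (F3): Similarly, it is readily obtained that $p^l(t) := E[x_t - x_t^l]^2$ satisfies

$$\dot p^l = 2\sigma^2(t) + 2E(x_t - x_t^l)g(t,x_t) - 2\frac{c(t)\sigma(t)}{\epsilon}p^l$$

Using the Schwarz inequality:

$$E\,ab \le (Ea^2)^{1/2}(Eb^2)^{1/2}$$

and the comparison theorem (see appendix) we get $p^l(t) \le u^l(t)$ where

$$\dot u^l = 2\sigma^2(t) + 2\beta(t)(u^l)^{1/2} - 2\frac{c(t)\sigma(t)}{\epsilon}u^l \tag{170}$$

with $\beta(t) = E^{1/2}g^2(t,x_t)$. Expanding $u^l \sim \sum_{i=0}^{\infty}u_i^l\epsilon^{i/2}$ in (170) and equating powers of $\epsilon$ gives $u_0^l = u_1^l = 0$ and $u_2^l = \frac{\sigma(t)}{c(t)}$, hence $u^l(t) = \frac{\sigma(t)}{c(t)}\epsilon + l(\epsilon)$. Therefore,

$$p(t) = p^l(t) = \frac{\sigma(t)}{c(t)}\epsilon + l(\epsilon), \qquad 0 < t \le T$$

Remark (1): (i) If $\sigma(t) = \sigma$ and $c(t) = c$ then $\dot u_1(t) = \dot\ell_1(t) = 0$ and the next terms in the expansion of $u(t)$ and $\ell(t)$ are:

$$u_2(t) = \frac{1}{c^2}\bar a(t)$$
$$\ell_2(t) = \frac{1}{c^2}\underline{a}(t)$$

so that $u(t) = \ell(t) + O(\epsilon^3)$ if and only if $\delta a = 0$, i.e., both $g$ and $h$ are linear.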
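For constant coefficients the small-$\epsilon$ expansion of the upper bound can be checked without solving any ODE, because the stationary root of the Riccati equation is available in closed form. The sketch below makes that comparison under assumed constants (`a_bar`, `c`, `sigma` are illustrative); it compares the exact stationary value with the one-term approximation $\frac{\sigma}{c}\epsilon$ and the two-term approximation $\frac{\sigma}{c}\epsilon + \frac{\bar a}{c^2}\epsilon^2$.

```python
import math

def riccati_steady_state(a_bar, c, sigma, eps):
    """Positive root of sigma^2 + 2*a_bar*u - (c^2/eps^2)*u^2 = 0."""
    k = eps**2 / c**2
    return k * (a_bar + math.sqrt(a_bar**2 + sigma**2 * c**2 / eps**2))

a_bar, c, sigma = 1.0, 1.0, 1.0   # illustrative constants (assumed)
errors = {}
for eps in (0.3, 0.1, 0.05):
    exact = riccati_steady_state(a_bar, c, sigma, eps)
    one_term = sigma / c * eps
    two_term = one_term + a_bar / c**2 * eps**2
    errors[eps] = (exact - one_term, exact - two_term)
    print(f"eps={eps}: exact={exact:.6f}, "
          f"one-term error={errors[eps][0]:.2e}, two-term error={errors[eps][1]:.2e}")
```

The one-term error shrinks like $\epsilon^2$ and the two-term error like $\epsilon^3$, which is the behavior the expansion predicts outside the boundary layer.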


(ii) In [14], it was shown that for incrementally conic nonlinearities we have the following lower bound $\ell(t)$:

$$p(t) \ge \ell(t) = (1 - s(t))\,r(t) \tag{172}$$

where  s(t)  is  the  unique  nonnegative  root  of 

(i  -  .(.))«•(<)  -  .-«*• 

(173) 

(174) 

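$$\dot q = \sigma^2(t) + \frac{c^2(t)}{\epsilon^2}r^2(t) + 2\left[a(t) - \frac{c^2(t)}{\epsilon^2}r\right]q, \qquad q(0) = \sigma_0^2 \tag{175}$$
$$\dot r = \sigma^2(t) + 2a(t)r - \frac{c^2(t)}{\epsilon^2}r^2, \qquad r(0) = \sigma_0^2 \tag{176}$$

From (160) and (165) we readily get that $r(t) = \frac{\sigma(t)}{c(t)}\epsilon + O(\epsilon^2)$. It is therefore clear from (172) that if $s(t) = O(\epsilon)$, then $\ell(t) = \frac{\sigma(t)}{c(t)}\epsilon + O(\epsilon^2)$, the same as the one we have used here.

This is indeed the case: (175) implies $q(t) = O(\epsilon)$ and (174) that $d(t) = O(\epsilon)$ ($\delta c = 0$). Assuming $s(t) \sim \sum_n s_n\epsilon^n$ and letting $\epsilon$ go to zero in (173) gives that $1 - s_0 = e^{-s_0}$ necessarily. This has the unique solution $s_0 = 0$, hence $s(t) = O(\epsilon)$.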

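Remark (2): Almost linear observations.

The results in the previous proposition can be extended to the particular class of nonlinearities $h \in\prec [c, \delta c]$ where $\delta c$ is also a small parameter. Indeed, the upper and lower bounds $u$ and $\ell$ on $p(t)$ and $p^*(t) := E(x_t - x_t^*)^2$, where $x_t^*$ is the BOF in (F1) (with $cx_t^*$ and $c$ replaced by $h(x_t^*)$ and $\underline{c}$), are given by (165) and (166):

$$u(t) = \frac{\sigma}{\underline{c}}\epsilon + O(\epsilon^2) = \frac{\sigma}{c}\epsilon\left(1 + \frac{\delta c}{c} + O((\delta c)^2)\right) + O(\epsilon^2) = \frac{\sigma}{c}\epsilon + \frac{\sigma\,\delta c}{c^2}\epsilon + O(\epsilon^2) + \epsilon\,O((\delta c)^2)$$

Thus, for small $\delta c$,

$$u(t) = O(\epsilon, \delta c)$$

Similarly

$$\ell(t) = \frac{\sigma}{\bar c}\epsilon + O(\epsilon^2) = \frac{\sigma}{c}\epsilon\left(1 - \frac{\delta c}{c} + O((\delta c)^2)\right) + O(\epsilon^2)$$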

It  is  not  hard  either  to  establish  that  for  the  analogs  of  the  filters  (F2)  and 
( F3 )  (as  in  (157)  and  (158),  but  with  cx  replaced  by  h(x))  the  upper  bounds 
are 

and 

M<(t)=£W€  +  0(t3OVer2) 

which makes these filters asymptotically optimal as $\delta c$ and $\epsilon$ become smaller, with

$$p(t) = \frac{\sigma(t)}{c(t)}\epsilon + l(\epsilon).$$

Application to the Beneš filter. Let

$$dx_t = f(x_t)\,dt + dw_t \tag{177}$$
$$dy_t = x_t\,dt + dv_t \tag{178}$$

where the drift $f$ satisfies

$$f_x(x) + f^2(x) = ax^2 + bx + c \tag{179}$$

with $a \ge 0$ to prevent finite time escape situations.

As mentioned earlier, this is one of the few nonlinear filtering problems which was shown to admit a finite number of sufficient statistics [3]. We are interested here in investigating this class of filtering problems when the diffusion process $\{x_t\}$ is measured in a low noise channel. In particular, we would like to know what type of implementation simplifications will result from this additional assumption. Accordingly, let $\{x_t\}$ be as in (177) and:

$$dy_t = x_t\,dt + \epsilon\,dv_t \tag{180}$$


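In order to know how $\epsilon$ enters Beneš' original formulas, we shall transform the DMZ equation in Fisk-Stratonovich form by following the steps outlined below. The unnormalized pdf $u(t,x)$ satisfies the following stochastic PDE:

$$du = \left(\mathcal{L}^*(u) - \frac12\frac{x^2}{\epsilon^2}u\right)dt + \frac{x}{\epsilon^2}u\circ dy \tag{181}$$
$$\mathcal{L}^*(u) = \frac12u_{xx} - (fu)_x$$

which in our case is

$$du = \left[\frac12u_{xx} - fu_x - \left(f_x + \frac{x^2}{2\epsilon^2}\right)u\right]dt + \frac{x}{\epsilon^2}u\circ dy \tag{182}$$

By letting $V(t,x) = e^{-xy_t/\epsilon^2}u(t,x)$, the stochastic differentials in (182) are eliminated. We obtain the following classical PDE (robust DMZ):

$$V_t = \frac12V_{xx} + \left(\frac{y}{\epsilon^2} - f\right)V_x - \left(\frac{y}{\epsilon^2}f + f_x + \frac{x^2}{2\epsilon^2} - \frac12\frac{y^2}{\epsilon^4}\right)V \tag{183}$$

Using $V(t,x) = \exp\left(\int_0^x f(\sigma)\,d\sigma\right)p(t,x)$ and (179) in (183), we get after some computations that:

$$p_t = \frac12p_{xx} + \frac{y}{\epsilon^2}p_x + \left[\frac{y^2}{2\epsilon^4} - \frac{1}{2\epsilon^2}(1 + \epsilon^2a)x^2 - \frac12bx - \frac12c\right]p$$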
It  can  be  easily  verified  that  p  is  given  by 

where 

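$$\dot\theta(t) = 1 - \frac{1}{\epsilon^2}(1 + \epsilon^2a)\theta^2(t), \qquad \theta(0) = 0$$

$$d\mu_t = -\frac{1}{\epsilon^2}(1 + \epsilon^2a)\theta(t)\mu_t\,dt - \frac12\theta(t)\,b\,dt + \frac{1}{\epsilon^2}\theta(t)\,dy_t \tag{184}$$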

u(t,x)  =  eh’v,exp^  J{a)d<r-  (185) 

Our goal is to see under what circumstances $\mu_t$ can be a good approximation for the conditional mean $E(x_t \mid y_0^t)$ given by:

$$E(x_t \mid y_0^t) = \frac{\int x\,u(t,x)\,dx}{\int u(t,x)\,dx}$$

It turns out that for cone bounded drifts in (179) (e.g., $f(x) = \tanh(x)$ or linear), the following holds.

Claim: $\{\mu_t\}$ is asymptotically optimal as $\epsilon \to 0$.

To see this, rewrite (184) in the more suggestive form:

$$d\mu_t = \frac{\theta(t)}{\epsilon^2}\left[dy_t - (1 + \epsilon^2a)\mu_t\,dt\right] - \frac12\theta(t)\,b\,dt$$

and notice that $\theta(t) = \frac{\epsilon}{\sqrt{1 + \epsilon^2a}}\tanh\left(\sqrt{1 + \epsilon^2a}\,\frac{t}{\epsilon}\right) \sim \epsilon + O(\epsilon^3)$. It is not hard then to show that $\mu_t = \mu_t^l + O(\epsilon)$ where

$$d\mu_t^l := \frac{1}{\epsilon}\left[dy_t - \mu_t^l\,dt\right]$$

is precisely the linear BOF obtained in the last proposition, which was shown to be asymptotically optimal as $\epsilon$ becomes smaller.

Notice that for the particular case $f(x) = \tanh(x)$, $a = b = 0$ and $c = 1$, and hence

$$d\mu_t = \frac{1}{\epsilon}\tanh\left(\frac{t}{\epsilon}\right)(dy_t - \mu_t\,dt)$$
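The closed form used above for the gain variable $\theta(t)$ can be verified numerically. This is a minimal sketch with illustrative values of $\epsilon$ and $a$ (both assumed); `theta_closed` and `theta_numeric` are hypothetical helper names. It compares the tanh formula with an RK4 integration of the Riccati equation for $\theta$ and checks that, outside the boundary layer, $\theta(t)$ differs from $\epsilon$ only at order $\epsilon^3$.

```python
import math

def theta_closed(t, eps, a):
    """Closed form of theta' = 1 - (1 + eps^2 a) theta^2 / eps^2, theta(0) = 0."""
    k = math.sqrt(1.0 + eps**2 * a)
    return eps / k * math.tanh(k * t / eps)

def theta_numeric(t_end, eps, a, n_steps=20000):
    """RK4 integration of the same Riccati equation, for comparison."""
    def rhs(th):
        return 1.0 - (1.0 + eps**2 * a) * th**2 / eps**2
    h, th = t_end / n_steps, 0.0
    for _ in range(n_steps):
        k1 = rhs(th)
        k2 = rhs(th + h * k1 / 2)
        k3 = rhs(th + h * k2 / 2)
        k4 = rhs(th + h * k3)
        th += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return th

eps, a = 0.1, 0.5   # illustrative values (assumed)
gap = abs(theta_closed(1.0, eps, a) - theta_numeric(1.0, eps, a))
print(f"theta(1) = {theta_closed(1.0, eps, a):.6f}, "
      f"closed form vs RK4 gap = {gap:.2e}, eps = {eps}")
```

With $\theta(t) \approx \epsilon$, the gain $\theta(t)/\epsilon^2$ in the filter reduces to $1/\epsilon$, which is the constant gain of the linear filter discussed above.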

2.5  Examples  and  Simulation  Results 

Example 1: In this example, the asymptotic optimality of the “KF” for WNL systems is illustrated. We consider:

$$dx_t = a\,x_t\,dt + \epsilon\tanh(x_t)\,dt + \sigma\,dw_t$$
$$dy_t = c\,x_t\,dt + \rho\,dv_t$$
$$x_0 \sim N(m_0, \sigma_0^2)$$

where $f(\cdot) = \tanh(\cdot) \in\prec [\tfrac12, \tfrac12]$, i.e., $\underline{\mu} = 0$, $\bar\mu = 1$, $\mu = \Delta\mu = \tfrac12$.

Simulations were done using a Monte Carlo technique and the following numerical data:

$$a = -1,\ \sigma = \rho = 0.3,\ c = 1,\ m_0 = 0,\ \sigma_0 = 0.1$$

The results are summarized in the plots of Figures 3, 4, and 5, which correspond to different values of $\epsilon$ ($\epsilon = 0.2$, $0.1$ and $0.05$ respectively). In each figure, we have plotted $p^k(t) := E(x_t - x_t^k)^2$, $r(t)$ and $\ell(t)$, the latter being the lower bound on the optimal MS-error $p(t)$, which therefore lies between $\ell(t)$ and $p^k(t)$.

The plots appear to corroborate the results of Proposition 3-2, in which it is stated that the “KF” is asymptotically optimal as $\epsilon$ becomes smaller and that $r(t)$ is a good approximation for the (unknown) optimal MS-error $p(t)$ in the sense that $p^k(t) \sim p(t) = r(t) + O(\epsilon)$.
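A small Monte Carlo experiment of this kind can be sketched as follows. This is not the authors' simulation code: the Euler-Maruyama discretization, step size, path count, seed, and the reading of $\sigma_0 = 0.1$ as a standard deviation are all assumptions made for illustration. The function `simulate_kf_mse` is a hypothetical name; it returns the empirical “KF” mean square error at the final time together with the Riccati value $r(T)$.

```python
import math
import random

def simulate_kf_mse(eps, n_paths=1000, t_end=3.0, dt=0.01, seed=0):
    """Empirical MSE of the linear-design "KF" on the weakly nonlinear model."""
    a, c, sigma, rho = -1.0, 1.0, 0.3, 0.3
    m0, sd0 = 0.0, 0.1          # assumed: sigma_0 is a standard deviation
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))
    sq = math.sqrt(dt)

    # Precompute r(t) by Euler steps on the Riccati equation and the gain c r / rho^2.
    r, gains = sd0**2, []
    for _ in range(n_steps):
        gains.append(c * r / rho**2)
        r += dt * (sigma**2 + 2 * a * r - (c**2 / rho**2) * r**2)

    sq_err = 0.0
    for _ in range(n_paths):
        x = rng.gauss(m0, sd0)   # true state
        xk = m0                  # "KF" estimate
        for G in gains:
            dy = c * x * dt + rho * sq * rng.gauss(0.0, 1.0)
            dx = (a * x + eps * math.tanh(x)) * dt + sigma * sq * rng.gauss(0.0, 1.0)
            xk += a * xk * dt + G * (dy - c * xk * dt)   # filter ignores eps*tanh
            x += dx
        sq_err += (x - xk)**2
    return sq_err / n_paths, r

results = {eps: simulate_kf_mse(eps) for eps in (0.2, 0.05)}
for eps, (mse, r_T) in results.items():
    print(f"eps={eps}: empirical MSE={mse:.4f}, r(T)={r_T:.4f}")
```

With more paths and a smaller step the empirical curve can be compared with $r(t)$ over the whole interval rather than only at the final time.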

[Figure panels: Asymptotic Optimality of the “KF” for Weakly Nonlinear Systems]

Example 2: This second example deals with the asymptotic optimality of the BOF and CGBOF in the case of low measurement noise level filtering problems. The following model is considered:

$$dx_t = \arctan(x_t)\,dt + \sigma\,dw_t$$
$$dy_t = c\,x_t\,dt + \epsilon\,dv_t \tag{186}$$
$$x_0 \sim N(m_0, \sigma_0^2)$$

Figure 6: BOF performance

where  g[-)  =  arctan(-)  €-<  [§,  1],  i.e.,  a  =  6a  =  1  and 

a  =  -l,<r  =  c  =  l,mo  =  0,  a\  =  0.5 

The simulations are summarized in Figures 6 and 7, which correspond to the performance of the BOF and CGBOF respectively. Each figure contains 3 sets of plots corresponding to $\epsilon = 0.3$, $0.1$ and $0.05$ from top to bottom. Each set of 3 curves consists of the upper bound $u(t)$ on the BOF, the MS-error $p^*(t) = E(x_t - x_t^*)^2$ and the lower bound $\ell(t)$ on the optimal MS-error $p(t)$.

[Figure panels: Asymptotic Optimality of the CGBOF, Low Measurement Noise Channel; curves $u(t)$ and $\ell(t)$ versus $t$ for $\epsilon = 0.3$, $0.1$, $0.05$]

Figure 7: CGBOF performance

Again, these plots seem to agree with the results of Proposition 4-1, in which it is stated that the BOF and CGBOF are both asymptotically optimal as $\epsilon$ becomes smaller and that $\frac{\sigma}{c}\epsilon$ (equal to $\epsilon$ here) is a good approximation for the (unknown) optimal MS-error $p(t)$.

Remark: It can be seen in Figure 7 that the MS-error $p^c(t)$ exceeds the (BOF) upper bound $u(t)$ in all three cases, as might be expected. To see why this is so, it suffices to recall that the CGBOF was obtained by approximating the BOF gain $k^*(t) := \frac{c}{\epsilon^2}u(t)$ by $\frac{\sigma}{\epsilon}$, since $u(t) \sim \frac{\sigma}{c}\epsilon$. However, it was remarked earlier that this last approximation does not hold in the immediate vicinity of $t = 0$ (boundary layer problem). Outside this region (which shrinks to zero as $\epsilon \to 0$), the CGBOF performs comparably to the BOF, with the advantage of speed.


2.6  Conclusions 

We investigated the asymptotic behavior of one dimensional nonlinear filtering problems involving drifts with bounded derivatives, using an upper and lower bound approach to show that the a priori mean square error associated with some suboptimal filters approaches the optimal one asymptotically. This approach demonstrates that significant information can be inferred from the derivative bounds (i.e., from the cone in which the nonlinearities reside). In particular, it is shown that in the case of weakly nonlinear systems the “KF” (designed for the underlying linear system) is asymptotically optimal as $\epsilon \to 0$. In other words, the nonlinearity can be ignored as far as the asymptotic behavior is concerned.

In the case of diffusions measured in a low noise channel, three asymptotically optimal filters were obtained, one of which is linear. Furthermore, asymptotic values for the unknown optimal MS-error were obtained in both cases.

The  main  point  is  that  upper  and  lower  bounds  on  the  optimal  MS-error, 
when  available,  may  be  used  (in  addition  to  permance  testing  of  subopti¬ 
mal  designs)  as  a  relatively  simple  tool  to  study  certain  nonlinear  filtering 
problems. 
Theorem (1): (Comparison Theorem [15]) Let $F(x,y)$ and $G(x,y)$ be continuous in the rectangle

$$D: \quad |x - x_0| < a, \quad |y - y_0| < b$$

and suppose that $F(x,y) < G(x,y)$ everywhere in $D$. Let $y(x)$ and $z(x)$ be the solutions of

$$y' = F(x,y), \quad y(x_0) = a \tag{187}$$

$$z' = G(x,z), \quad z(x_0) = a$$

Let $I$ be the largest subinterval of $(x_0 - a, x_0 + a)$ where both $y(x)$ and $z(x)$ are defined and continuous; then for $x \in I$

$$z(x) < y(x), \quad x < x_0$$

$$z(x) > y(x), \quad x > x_0$$

Theorem (2): (Perron [12]) If $F(t)$, $f_i(t)$, $t \in [0, \infty[$, $i = 1, \dots, n$, are real continuous functions of $t$ having finite limits

$$\lim_{t \to \infty} F(t) = b, \quad \lim_{t \to \infty} f_i(t) = a_i,$$

and if the roots $\lambda_i$, $i = 1, \dots, n$ of the equation

$$\rho^n + a_1 \rho^{n-1} + \dots + a_n = 0$$

are real, distinct, and different from 0, then the equation

$$\frac{d^n}{dt^n} y(t) + f_1(t) \frac{d^{n-1}}{dt^{n-1}} y(t) + \dots + f_n(t) y(t) = F(t) \tag{188}$$

has at least one solution $y(t)$ with $\lim_{t \to \infty} y(t) = b/a_n$ and $\lim_{t \to \infty} \frac{d^k}{dt^k} y(t) = 0$, $k = 1, \dots, n$. If $\lambda_i < 0$, $i = 1, \dots, n$, then all solutions of (188) have these properties.
3  Optimal Sensor Scheduling in Nonlinear Filtering

3.1  Introduction

3.1.1  Motivation and preliminaries

The problem of nonlinear filtering of diffusion processes has received considerable attention in recent years; see the anthologies [1], [2], [3] for a review of important developments. In current studies, as well as in related analyses of the partially observed stochastic control problem with such models [4], [5], a key role is played by the linear stochastic partial differential equation describing the evolution of the unnormalized conditional probability measure of the state process given the past of the observations, the so-called Zakai equation.

A  significant  byproduct  of  these  advances  is  the  feasibility  of  analyzing  complex  signal 
processing  problems,  including  adaptive  and  sensitivity  studies,  in  an  integrated,  systematic 
manner, without heuristic or ad hoc assumptions. A problem of interest in this area is the
so  called  sensor  scheduling  problem.  Roughly  speaking  this  problem  is  concerned  with  the 
simultaneous  selection  (according  to  some  performance  measure)  of  a  signal  processing  scheme 
together  with  the  sensors  that  collect  the  data  to  be  processed.  Particular  applications  include 
multiple sensor platforms, distributed sensor networks, and large scale systems. For example, in
a multiple sensor platform, there is a definite need for coordinating the data obtained from
the  various  sensors  which  may  include  radar,  infrared,  sonar,  etc.  The  data  obtained  from 
different  sensors  are  of  varying  quality  and  a  systematic  way  is  needed  for  allocating  confidence 
or  basing  decisions  on  data  collected  from  different  types  of  sensors.  For  example  radar  sensors 
are  more  accurate  than  infrared  sensors  for  long  range  tracking  while  the  opposite  is  true  for 
short  range  tracking.  In  sensor  networks  one  needs  to  coordinate  data  collected  from  a  large 
number  of  sensors  distributed  over  a  large  geographical  area.  Conflicts  should  be  resolved 
and  a  preferred  set  of  sensors  must  be  selected,  over  finite  (short)  time  intervals,  and  utilized 
in  detection,  estimation  or  control  decisions.  Similarly  in  large  scale  systems  there  is  typically 
an  attached  information  network  with  the  objective  of  collecting  data,  processing  them  and 
making  the  results  available  to  the  many  control  agents  for  their  decisions  (actions).  Again 
the  need  for  coordinating  this  information  in  a  systematic  way  is  critical. 

In  such  sensor  scheduling  problems  the  systematic  utilization  of  sensors  should  be  the 
result  of  optimizing  reasonably  defined  performance  measures.  Clearly  these  performance 
measures  shall  include  terms  allocating  penalties  for  errors  in  detection  and/or  estimation. 
But  more  importantly,  they  must  include  terms  for  costs  associated  with  turning  sensors 
on or off, and for switching from one sensor to another. Examples of such costs arising in
practice  abound.  Turning  on  a  radar  sensor  increases  the  detectability  of  the  platform  (since 
radars  are  active  sensors)  and  this  should  be  reflected  as  a  switching  cost.  Deciding  to  use 
a  more  accurate,  albeit  more  complex  sensor,  will  require  higher  bandwidth  communications 
and  often  more  computational  power  allocated  to  that  sensor.  In  distributed  sensor  networks 
it may mean the physical movement of a sensor carrying platform (such as a helicopter
or  airplane)  to  a  particular  geographical  location.  In  large  scale  systems  the  utilization  of 
several (often hundreds of) sensors for decision making may provide better average performance
but  it  certainly  reduces  the  response  speed  of  the  system  to  changing  conditions,  and  it 
increases  computational  and  communication  costs  both  in  terms  of  hardware  and  software. 
The latter are evident in large computer/communication networks. These running
and switching costs will often depend on the part of the state space occupied by the state
vector, i.e. they will be functions of the state as well. For example sensors have different
accuracy  or  noise  characteristics  when  the  state  process  takes  values  in  different  areas  of 
the  state  space.  Also  there  is  cost  associated  with  handling  the  transfer  of  information,  or 
tracking  record,  when  there  are  changes  in  the  set  of  sensors  used;  and  these  costs  often 
depend  on  the  state  process. 

It is not our intent to provide an extensive description of applications here. Detailed descriptions of some of these problems can be found elsewhere; see for example [6], [7]. The underlying thread in all these problem areas is the existence of a variety of sensors, which provide data (for processing) including information of widely varying quality about parameters or variables of interest, for control, detection, estimation etc. Due to the complexity of these problems it is important to develop systematic conceptual, analytical and numerical methods for their study and to reduce reliance on ad hoc, heuristic methods as much as possible. The present paper is offered as a contribution in this direction. It provides a general methodology for this problem by reducing it to the analysis of a system of quasi-variational inequalities (see section 3 for details). Numerical methods will be described elsewhere [13].

The  sensor  scheduling  problem  is  considered  here  in  the  context  of  non-linear  filtering  of 
diffusion  processes,  and  is  therefore  applicable  to  detection  problems  with  the  same  signal 
models.  Modifications  of  the  results  apply  to  other  situations  including  control.  In  the  next 
section we present a somewhat heuristic definition of the problem, intended to describe the
problem clearly, at an intuitive level. The intricacies of establishing this model in a rigorous
mathematical fashion are given in section 2, and constitute one of the main contributions of
the paper.

3.1.2  Preliminary description of the problem

The problem considered is as follows. A signal (or state) process $x(\cdot)$ is given, modelled by the diffusion

$$dx(t) = f(x(t))\,dt + g(x(t))\,dw(t), \qquad x(0) = \xi \tag{1.1}$$

in $\mathbb{R}^n$. We further consider $M$ noisy observations of $x(\cdot)$, described by

$$dy^i(t) = h^i(x(t))\,dt + R_i^{1/2}\,dv^i(t), \qquad y^i(0) = 0 \tag{1.2}$$

with values in $\mathbb{R}^{d_i}$. Here $w(\cdot)$, $v^i(\cdot)$ are independent, standard Wiener processes in $\mathbb{R}^n$, $\mathbb{R}^{d_i}$ respectively, and $R_i = R_i^T > 0$ are $d_i \times d_i$ matrices. Further mathematical details on the system (1.1), (1.2) will be given in section 2. Let us consider a finite time horizon $[0,T]$. To formulate the problem of determining an optimal utilization schedule for the available sensors, so as to simultaneously minimize the cost of errors in estimating a function of $x(\cdot)$ and the costs of using as well as of switching between various sensors, we need to specify these costs. To this end, let $c_i(x)$ denote the cost per unit time when using sensor $i$ while the state of the system is $x$; let $k_{i0}(x)$, $k_{0i}(x)$ denote the cost for turning off, respectively on, the $i$th sensor when the state of the system is $x$. The objective of the performed signal processing is to compute, at time $T$, an estimate $\hat\phi(T)$ of a given function $\phi(x(T))$ of the state. Penalties for errors in estimation are assessed according to the cost function

$$E\{c_e(\phi(x(T)) - \hat\phi(T))\} := E\{\|\phi(x(T)) - \hat\phi(T)\|^2\}. \tag{1.3}$$

We shall comment briefly on more general estimation problems in section 4 of this paper. In particular the consideration of a quadratic $c_e(\cdot)$ is not a serious restriction.

We consider next the set of all possible sensor activation configurations, denoted here by $\mathcal{N}$. An element $\nu \in \mathcal{N}$ is a word of length $M$ from the alphabet $\{0,1\}$. If the $i$th position is occupied by a 1, the $i$th sensor is activated (used); if by a 0, the $i$th sensor is off. There are $N = 2^M$ elements in $\mathcal{N}$. A schedule of sensors is then a piecewise constant function $u(\cdot): [0,T] \to \mathcal{N}$. We let $\tau_j \in [0,T]$ denote the instants of changing schedule; i.e., the moments when at least one sensor is turned on or off. At such a switching moment, suppose the schedule before is characterized by $\nu \in \mathcal{N}$, and after by $\nu' \in \mathcal{N}$. Then the switching cost associated with such a scheduling change will be

$$k_{\nu\nu'}(x) := \sum_{\{i \in \nu,\ i \notin \nu'\}} k_{i0}(x) + \sum_{\{i \notin \nu,\ i \in \nu'\}} k_{0i}(x). \tag{1.4}$$

The total running cost associated with schedule $\nu \in \mathcal{N}$ will be

$$c_\nu(x) := \sum_{\{i \in \nu\}} c_i(x). \tag{1.5}$$

In (1.4), (1.5), the symbol $\{i \in \nu\}$ denotes the set of all indices (from the set $\{1,2,\dots,M\}$) which are occupied by a 1 in $\nu$ (i.e. the indices corresponding to the sensors which are on); similarly the symbol $\{i \notin \nu\}$ denotes the set of indices corresponding to sensors that are off.
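
As a minimal sketch of (1.4) and (1.5), assuming a scalar state and configurations written as bit strings, the running cost of a configuration and the cost of a change of configuration can be computed as follows. The function names and the particular cost functions are hypothetical illustrations and are not part of the original formulation.

```python
from typing import Callable, Sequence

Cost = Callable[[float], float]


def running_cost(nu: str, c: Sequence[Cost], x: float) -> float:
    """Sum the per-sensor running costs over the sensors that are on in nu, as in (1.5)."""
    return sum(c[i](x) for i, bit in enumerate(nu) if bit == "1")


def switching_cost(nu: str, nu_new: str, k_off: Sequence[Cost],
                   k_on: Sequence[Cost], x: float) -> float:
    """Cost of moving from configuration nu to nu_new, as in (1.4)."""
    total = 0.0
    for i, (old, new) in enumerate(zip(nu, nu_new)):
        if old == "1" and new == "0":      # sensor i is turned off
            total += k_off[i](x)
        elif old == "0" and new == "1":    # sensor i is turned on
            total += k_on[i](x)
    return total


# Hypothetical state-dependent costs for M = 2 sensors.
c = [lambda x: 1.0, lambda x: 2.0 + abs(x)]
k_off = [lambda x: 0.5, lambda x: 0.25]
k_on = [lambda x: 3.0, lambda x: 1.0]

print(running_cost("11", c, 2.0))                     # both sensors on
print(switching_cost("10", "01", k_off, k_on, 0.0))   # sensor 1 off, sensor 2 on
```

Sensors whose state does not change contribute nothing to the switching cost, so keeping the same configuration costs zero.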

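Using the above notation the available observations, under sensor schedule $u(\cdot)$, are described by

$$dy(t,u(t)) := h(x(t),u(t))\,dt + r(u(t))\,dv(t), \tag{1.6}$$

where it is apparent that the available observations depend explicitly on the sensor schedule $u(\cdot)$. In (1.6), for $x \in \mathbb{R}^n$, $\nu \in \mathcal{N}$,

$$h(x,\nu) := \begin{bmatrix} h^1(x)\,\chi_{\{\nu\}}(1) \\ h^2(x)\,\chi_{\{\nu\}}(2) \\ \vdots \\ h^M(x)\,\chi_{\{\nu\}}(M) \end{bmatrix} \tag{1.7}$$

a block column vector, where in standard notation

$$\chi_{\{\nu\}}(i) = \begin{cases} 1, & \text{if the } i\text{th position in the word } \nu \text{ is occupied by a 1} \\ 0, & \text{otherwise.} \end{cases} \tag{1.8}$$

Similarly for $\nu \in \mathcal{N}$

$$r(\nu) := \text{Block diagonal}\{R_i^{1/2}\chi_{\{\nu\}}(i)\}, \tag{1.9}$$

where $R_i$ are the symmetric, positive matrices defined above. Finally

$$v(t) := \begin{bmatrix} v^1(t) \\ \vdots \\ v^M(t) \end{bmatrix} \tag{1.10}$$

is a higher dimensional standard Wiener process. In view of (1.7), for all $\nu \in \mathcal{N}$

$$h(\cdot,\nu): \mathbb{R}^n \to \mathbb{R}^D, \tag{1.11}$$

while

$$r(\nu): \mathbb{R}^D \to \mathbb{R}^D, \tag{1.12}$$

where

$$D = d_1 + d_2 + \dots + d_M. \tag{1.13}$$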
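To make the notation clearer, consider the case $M = 2$, $N = 4$. Then $\mathcal{N} = \{00, 01, 10, 11\}$,

$$h(x,00) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad h(x,01) = \begin{bmatrix} 0 \\ h^2(x) \end{bmatrix}, \quad h(x,10) = \begin{bmatrix} h^1(x) \\ 0 \end{bmatrix}, \quad h(x,11) = \begin{bmatrix} h^1(x) \\ h^2(x) \end{bmatrix}, \tag{1.14}$$

while

$$r(00) = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad r(10) = \begin{bmatrix} R_1^{1/2} & 0 \\ 0 & 0 \end{bmatrix}, \quad r(01) = \begin{bmatrix} 0 & 0 \\ 0 & R_2^{1/2} \end{bmatrix}, \quad r(11) = \begin{bmatrix} R_1^{1/2} & 0 \\ 0 & R_2^{1/2} \end{bmatrix}. \tag{1.15}$$

Clearly the dimension of the range space of $y(\cdot,\nu)$ is

$$D_\nu = \sum_{i=1}^{M} d_i\,\chi_{\{\nu\}}(i). \tag{1.16}$$

Of course for all $\nu$, $y(t,\nu) \in \mathbb{R}^D$.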
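
The construction of the stacked observation function and of the active dimension can be sketched in a few lines. This is only an illustration under simplifying assumptions: the state is scalar, configurations are bit strings, and the two sensor functions below are hypothetical choices with output dimensions 1 and 2.

```python
import math
from typing import Callable, List, Sequence


def chi(nu: str, i: int) -> int:
    """Indicator of (1.8): 1 if position i (1-based) of the word nu holds a 1."""
    return 1 if nu[i - 1] == "1" else 0


def stacked_h(nu: str, h: Sequence[Callable[[float], List[float]]], x: float) -> List[float]:
    """Block column vector of (1.7): block i is h[i](x) if sensor i is on, zeros otherwise."""
    out: List[float] = []
    for i, h_i in enumerate(h, start=1):
        out.extend(chi(nu, i) * value for value in h_i(x))
    return out


def active_dimension(nu: str, d: Sequence[int]) -> int:
    """Dimension of (1.16): total output dimension of the sensors that are on."""
    return sum(d_i * chi(nu, i) for i, d_i in enumerate(d, start=1))


# Hypothetical sensors: the first has one output, the second has two.
h = [lambda x: [x], lambda x: [x * x, math.sin(x)]]
d = [1, 2]

for nu in ("00", "01", "10", "11"):
    print(nu, stacked_h(nu, h, 2.0), active_dimension(nu, d))
```

The stacked vector always has length $D = d_1 + \dots + d_M$ (3 in this sketch), while the number of entries that carry information is the active dimension of the chosen configuration.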

Following established terminology (c.f. [9]) we see that a sensor scheduling strategy is defined by an increasing sequence of switching times $\tau_j \in [0,T]$ and the corresponding sequence $\nu_j \in \mathcal{N}$ of sensor activation configurations. We shall denote such a strategy by $u(\cdot)$, where

$$u(t) = \nu_j, \quad t \in [\tau_j, \tau_{j+1}); \quad j = 1, 2, \dots \tag{1.17}$$

As stated earlier we are interested in the simultaneous minimization of costs due to estimation errors as well as sensor scheduling. We shall therefore consider joint estimation and sensor scheduling strategies. Such a strategy consists of two parts: the sensor scheduling strategy $u$ (see (1.17)) and the estimator $\hat\phi$. The set of admissible strategies $U_{ad}$ is the customary set of strategies adapted to the sequence of $\sigma$-algebras

$$\mathcal{F}_t^{y(\cdot,u(\cdot))} := \sigma\{y(s,u(s)),\ s \le t\}. \tag{1.18}$$

That is, we consider strict sense admissible controls in the sense of [4]. For the problem under investigation this last statement must be interpreted very carefully. First, we have indicated in (1.18) that the $\sigma$-algebra of available past observation data depends (as is evident from (1.6) - (1.9)) very strongly on the sensor schedule $u(\cdot)$. This dependence is non-standard, as here the dimension of the observation vector and the noise covariance change drastically at each switching time $\tau_j$. In standard stochastic control formulations [4], [5], the dependence of $y$ on $u(\cdot)$ is much more implicit. This is a difficult part of the formulation here, since it prevents us from using Girsanov transformations in a straightforward manner. Secondly (1.18) means that the switching times $\tau_j$ and the variables $\nu_j$, which define $u(\cdot)$, must be adapted to the filtration $\mathcal{F}_t^{y(\cdot,u(\cdot))}$, which depends essentially on the values of $\tau_j$ and $\nu_j$. Finally (1.18) also means that $\hat\phi(T)$ must be measurable with respect to $\mathcal{F}_T^{y(\cdot,u(\cdot))}$. We shall describe a rigorous mathematical construction of such a model in section 2.

Given such a strategy the corresponding cost is

$$\begin{aligned} J(u(\cdot),\hat\phi) := E\Big\{ &\|\phi(x(T)) - \hat\phi(T)\|^2 && (1.19) \\ &+ \int_0^T c(x(t),u(t))\,dt && (1.20) \\ &+ \sum_j k(x(\tau_j),\nu_{j-1},\nu_j) \Big\}. && (1.21) \end{aligned}$$

Here for $x \in \mathbb{R}^n$, $\nu, \nu' \in \mathcal{N}$

$$c(x,\nu) := c_\nu(x), \tag{1.22}$$

(c.f. Eq. (1.5)), and

$$k(x,\nu,\nu') = k_{\nu\nu'}(x), \tag{1.23}$$

(c.f. Eq. (1.4)).

The optimal sensor scheduling in nonlinear filtering is thus formulated as the determination of a strategy achieving

$$\inf_{(u(\cdot),\hat\phi) \in U_{ad}} J(u(\cdot),\hat\phi) \tag{1.24}$$

among all admissible strategies.

To simplify the notation a little, let us order the elements of $\mathcal{N}$ according to the numbers they represent in binary form. For example in the case $M = 2$, $N = 4$ we replace $\mathcal{N} = \{00, 01, 10, 11\}$ by the set of integers $\{1, 2, 3, 4\}$. That is, the one-one correspondence between $\mathcal{N}$ and $\{1, 2, \dots, N\}$ is described by

$$\nu \mapsto (\text{integer represented by } \nu) + 1, \qquad k \mapsto \text{binary representation of } (k - 1). \tag{1.25}$$

So in the sequel of the paper we replace all the $\nu$, $\nu'$ in equations (1.4) - (1.23) by the corresponding integers from $\{1, 2, \dots, N\}$.
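
The correspondence (1.25) is easy to state in code. The following is a small sketch, assuming configurations are stored as bit strings of length $M$; the function names are hypothetical.

```python
def config_to_index(nu: str) -> int:
    """Map a configuration word to its integer label: (integer represented by nu) + 1."""
    return int(nu, 2) + 1


def index_to_config(k: int, M: int) -> str:
    """Map an integer label k in {1, ..., 2**M} to the M-bit binary word for k - 1."""
    return format(k - 1, f"0{M}b")


M = 2
configs = [index_to_config(k, M) for k in range(1, 2 ** M + 1)]
print(configs)  # ['00', '01', '10', '11']
print([config_to_index(nu) for nu in configs])  # [1, 2, 3, 4]
```

The two maps are inverse to each other, so either representation can be used interchangeably in the formulas that follow.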

The  structure  of  the  paper  is  as  follows.  In  section  2  a  precise  mathematical  formulation  is 
given  and  the  corresponding  stochastic  control  problem  is  precisely  defined.  In  section  3  the 
set of quasi-variational inequalities solving the problem is derived. In section 4 we offer some
comments  and  discussion  for  extensions,  further  developments  and  computational  methods. 

3.2  The Stochastic Control Formulation


3.2.1  Setting of the model

Let $(\Omega, \mathcal{A}, P)$ be a complete probability space, on which a filtration $\mathcal{F}_t$ is given, $\mathcal{F} = \mathcal{F}_\infty$. Let $w(\cdot)$ and $z(\cdot)$ be two independent, standard $\mathcal{F}_t$-Wiener processes with values in $\mathbb{R}^n$ and $\mathbb{R}^D$ respectively, carried by this probability space. On the same space we consider also an $\mathbb{R}^n$-valued random variable $\xi$, independent of $w(\cdot)$, $z(\cdot)$, and with probability distribution function $\pi_0$.

We consider the Ito equation (1.1), where $f(\cdot)$ is $\mathbb{R}^n$-valued, bounded and Lipschitz, while $g(\cdot)$ is $\mathbb{R}^{n \times n}$-valued, bounded and Lipschitz. Letting $a = \frac{1}{2} g g^T$, we assume $a \ge \alpha I_n$, where $\alpha > 0$ and $I_n$ is the $n \times n$ identity matrix. The Lipschitz property is unnecessary and can be easily removed using Girsanov's transformation (i.e. consider weak solutions of (1.1)) [8]. It is assumed here to simplify the technicalities not related to the main issues of the paper. Under these assumptions (1.1) has a strong solution with well known properties [8]. Note that under $P$, $z(\cdot)$ is independent of $x(\cdot)$.

Consider next functions $h^i(\cdot)$, $i = 1, \dots, M$, from $\mathbb{R}^n$ into $\mathbb{R}^{d_i}$, which are bounded and Hölder continuous. We shall denote by $L$ the infinitesimal generator of the Markov process $x(\cdot)$,

$$L := \sum_{i=1}^{n} f_i \frac{\partial}{\partial x_i} + \sum_{i,j=1}^{n} a_{ij} \frac{\partial^2}{\partial x_i \partial x_j} \tag{2.1}$$

or in divergence form

$$L = \sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\Big(a_{ij} \frac{\partial}{\partial x_j}\Big) - \sum_{i=1}^{n} a_i \frac{\partial}{\partial x_i} \tag{2.1a}$$

where

$$a_i := -f_i + \sum_{j=1}^{n} \frac{\partial a_{ij}}{\partial x_j}. \tag{2.1b}$$
Let us next consider an impulsive control defined as follows. There is a sequence $\tau_1 < \tau_2 < \dots < \tau_k < \dots$ of increasing $\mathcal{F}_t$-stopping times. To each time $\tau_i$ we attach an $\mathcal{F}_{\tau_i}$-measurable random variable $u_i$ with values in the set of integers $\{1, 2, \dots, N\}$.¹ We define

$$u(t) = u_i, \quad \tau_i \le t < \tau_{i+1}, \quad i = 0, 1, 2, \dots \tag{2.2}$$

and set $\tau_0 = 0$. We require that

$$\tau_i \uparrow T \text{ as } i \uparrow \infty, \tag{2.3}$$

while $\tau_k = T$ is possible for some finite $k$.

Let $\nu_i$ be the element of $\mathcal{N}$ corresponding to $u_i$ via (1.25).

Then define

$$h(x,u(t)) := h(x,\nu_i), \quad \tau_i \le t < \tau_{i+1}, \tag{2.4}$$

where $h(x,\nu)$ is defined by (1.7), in terms of the given functions $h^i(\cdot)$. Clearly $h(\cdot,u(t))$ maps $\mathbb{R}^n$ into $\mathbb{R}^D$ for all sensor schedules $u(\cdot)$ and is obviously bounded and Hölder continuous in $x$. Define also

$$r(u(t)) := r(\nu_i), \quad \tau_i \le t < \tau_{i+1}, \tag{2.5}$$

where $r(\cdot)$ is defined by (1.9), in terms of the given matrices $R_i$, $i = 1, 2, \dots, M$. Clearly $r(u(t))$ maps $\mathbb{R}^D$ into $\mathbb{R}^D$ for all sensor schedules $u(\cdot)$ but it is singular. Next we define $\tilde h(x,\nu)$ to be the vector valued function

$$\tilde h(x,\nu) := \begin{bmatrix} R_1^{-1/2} h^1(x)\,\chi_{\{\nu\}}(1) \\ \vdots \\ R_M^{-1/2} h^M(x)\,\chi_{\{\nu\}}(M) \end{bmatrix} \tag{2.6}$$

with $\chi_{\{\nu\}}(i)$ defined as in (1.8). Let

$$\tilde h(x,u(t)) := \tilde h(x,\nu_i), \quad \tau_i \le t < \tau_{i+1}. \tag{2.7}$$

Clearly $\tilde h(\cdot,u(t))$ maps $\mathbb{R}^n$ into $\mathbb{R}^D$ for all sensor schedules $u(\cdot)$ and is obviously bounded and Hölder continuous in $x$. We shall refer to $u(\cdot)$ as the impulsive control. As we shall see, it describes essentially the decision to select, at a sequence of decision times, one of the functions $\tilde h(\cdot,k)$, $k \in \{1, 2, \dots, N\}$. This is the precise mathematical implementation of the sensor selection decision described in the introduction.

To see that indeed this is the case, we can, with the above preparation, use Girsanov's measure transformation method. Let us then consider the process

$$\zeta(t) = \exp\Big\{\int_0^t \tilde h(x(s),u(s))^T\,dz(s) - \frac{1}{2}\int_0^t \|\tilde h(x(s),u(s))\|^2\,ds\Big\} \tag{2.8}$$

¹Recall that $N = 2^M$ and the binary representation of each integer $1, 2, \dots, N$ determines a sensor activation configuration by (1.25).
where $T$ denotes transpose and $\|\cdot\|$ is the $\mathbb{R}^D$ norm. Note that the process $u(t)$ is adapted to $\mathcal{F}_t$. Then since $x(\cdot)$ is adapted to $\mathcal{F}_t^{w} \subset \mathcal{F}_t$ and $u(\cdot)$ is cadlag [8], (2.8) is well defined. Moreover since $\tilde h$ is bounded, by Girsanov's theorem [8], [14], $\zeta(\cdot)$ is an $\mathcal{F}_t$-martingale. We can thus define a change of probability measure

$$\frac{dP^{u(\cdot)}}{dP}\Big|_{\mathcal{F}_T} = \zeta(T) \tag{2.9}$$

and consider the process

$$v(t) = z(t) - \int_0^t \tilde h(x(s),u(s))\,ds. \tag{2.10}$$

By Girsanov's theorem [8], [14], under the probability measure $P^{u(\cdot)}$ on $(\Omega, \mathcal{F})$, $v(\cdot)$ is a standard $\mathcal{F}_t$-Wiener process with values in $\mathbb{R}^D$. Furthermore, by the independence of $w(\cdot)$ and $z(\cdot)$, $w(\cdot)$ remains a standard $\mathbb{R}^n$-valued $\mathcal{F}_t$-Wiener process which is independent of $v(\cdot)$. Finally $\xi$ remains independent of $w(\cdot)$, $v(\cdot)$ while keeping its probability law, denoted by $\pi_0$. Thus $x(\cdot)$ also retains its probability law under $P^{u(\cdot)}$.

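To relate this construction, i.e. (2.2) - (2.10), with the $M$ noisy observations (sensors) loosely described in the introduction (c.f. in particular eq. (1.6)), observe that (2.10) can be written as

$$r(u(t))\,dz(t) = h(x(t),u(t))\,dt + r(u(t))\,dv(t) \tag{2.11}$$

in view of (1.7), (1.9), (2.4), (2.5), (2.6) and (2.7). Indeed

$$r(u(t))\,\tilde h(x,u(t)) = \begin{bmatrix} R_1^{1/2}\chi_{\{\nu_i\}}(1) & 0 & \cdots & 0 \\ 0 & R_2^{1/2}\chi_{\{\nu_i\}}(2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & R_M^{1/2}\chi_{\{\nu_i\}}(M) \end{bmatrix} \begin{bmatrix} R_1^{-1/2} h^1(x)\,\chi_{\{\nu_i\}}(1) \\ R_2^{-1/2} h^2(x)\,\chi_{\{\nu_i\}}(2) \\ \vdots \\ R_M^{-1/2} h^M(x)\,\chi_{\{\nu_i\}}(M) \end{bmatrix} = h(x,\nu_i), \quad \tau_i \le t < \tau_{i+1}. \tag{2.12}$$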
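
The cancellation behind (2.12) can be checked numerically in the simplest setting. The sketch below assumes scalar sensors ($d_i = 1$), so that each $R_i$ is a positive number and the block diagonal matrix reduces to a list of diagonal entries; the values of $R_i$ and the sensor functions are hypothetical.

```python
import math

# Hypothetical scalar noise intensities R_1, R_2 > 0 and sensor functions h^1, h^2.
R = [4.0, 0.25]
h = [lambda x: 3.0 * x, lambda x: x - 1.0]


def chi(nu: str, i: int) -> int:
    """Indicator of (1.8): 1 if sensor i (1-based) is on in configuration nu."""
    return 1 if nu[i - 1] == "1" else 0


def r_diag(nu: str) -> list:
    """Diagonal entries of r(nu) as in (1.9), for scalar sensors."""
    return [math.sqrt(R[i - 1]) * chi(nu, i) for i in range(1, len(R) + 1)]


def h_tilde(nu: str, x: float) -> list:
    """Normalized observation function of (2.6), for scalar sensors."""
    return [h[i - 1](x) / math.sqrt(R[i - 1]) * chi(nu, i) for i in range(1, len(R) + 1)]


def h_stacked(nu: str, x: float) -> list:
    """Stacked observation function of (1.7), for scalar sensors."""
    return [h[i - 1](x) * chi(nu, i) for i in range(1, len(R) + 1)]


def identity_holds(nu: str, x: float) -> bool:
    """Check r(nu) * h_tilde(x, nu) == h(x, nu) entry by entry."""
    lhs = [a * b for a, b in zip(r_diag(nu), h_tilde(nu, x))]
    return all(math.isclose(l, r, abs_tol=1e-12) for l, r in zip(lhs, h_stacked(nu, x)))


print(all(identity_holds(nu, 2.0) for nu in ("00", "01", "10", "11")))
```

The identity relies on the indicator being idempotent: an active sensor contributes $R_i^{1/2} R_i^{-1/2} h^i = h^i$, and an inactive sensor contributes zero on both sides.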

To give a precise meaning to (1.2), or (1.6), let us introduce the process

$$y(t,u(t)) := y^{\nu_i}(t), \quad \tau_i \le t < \tau_{i+1} \tag{2.13}$$

where

$$dy^{\nu_i}(t) := r(\nu_i)\,dz(t) = h(x(t),\nu_i)\,dt + r(\nu_i)\,dv(t). \tag{2.14}$$

It is clear that if we select $u(t) = \nu$, $\forall t$, where $\nu$ has 0 everywhere except for one 1 in the $i$th location, then (1.2) results. It is also rather plain that $y^{\nu_i}(t) \in \mathbb{R}^{D_{\nu_i}}$ and that in this case the Wiener process $r(\nu)v(\cdot)$ is also $D_\nu$-dimensional (see (1.16) for the definition of $D_\nu$). The process $y^{\nu_i}(t)$ represents exactly the observation which is available in $[\tau_i, \tau_{i+1})$.
The next issue that we wish to clarify relates to the measurability question that we discussed in section 1.2, after eq. (1.18). For any $u(\cdot)$, given the construction of $y(\cdot,u(\cdot))$ above, we can now consider $\mathcal{F}_t^{y(\cdot,u(\cdot))}$ as defined by (1.18). We shall say that $u(\cdot)$ is admissible, denoted $u \in U_{ad}$, if $u(t)$ is $\mathcal{F}_t^{y(\cdot,u(\cdot))}$ measurable, $t \ge 0$, where $\mathcal{F}_t^{y(\cdot,u(\cdot))}$ is constructed as above. This more precisely means that the $\tau_i$ are $\mathcal{F}_t^{y(\cdot,u(\cdot))}$-stopping times, or that

$$\{\tau_i \le t\} \in \mathcal{F}_t^{y(\cdot,u(\cdot))} \tag{2.15}$$

and that

$$\nu_i \in \mathcal{F}_{\tau_i}^{y(\cdot,u(\cdot))}. \tag{2.16}$$

Note that since $\mathcal{F}_t^{y(\cdot,u(\cdot))} \subset \mathcal{F}_t$ for any sensor schedule $u(\cdot)$ adapted to $\mathcal{F}_t^{y(\cdot,u(\cdot))}$, if the $\tau_i$ are $\mathcal{F}_t^{y(\cdot,u(\cdot))}$-stopping times they are also $\mathcal{F}_t$-stopping times, and the above construction (2.8) - (2.14) is still valid. The implication of (2.15), (2.16) is that one should check that an optimizing strategy, obtained by some procedure, satisfies the admissibility conditions. Clearly $U_{ad}$ is nonempty, as strategies $u(t) = \nu$, $t \in [0,T]$, obviously are admissible. Strategies with fixed switchings are also admissible. Note that for an admissible control $\mathcal{F}_t^{y(\cdot,u(\cdot))} \subset \mathcal{F}_t^z$.

We have thus established in this section the precise mathematical models of nonlinear filtering problems where selection of sensors is possible. In particular we have succeeded in circumventing the subtleties associated with the definition of admissible sensor schedules discussed in section 1.2.²

3.2.2  The optimization problem

For the dynamical system described in 2.1, we consider now the cost functional (1.19) where the underlying probability measure is $P^{u(\cdot)}$. As indicated in the introduction, the general problem where the function $\phi$ is in a nice class, e.g., bounded $C^2$, or polynomial, or $C^\infty$, can be treated along identical lines. To simplify the notation we have chosen to formulate the problem for $\phi(x) = x$. The technical difficulties for this case are identical to the ones in the more general cases discussed above, particularly since this $\phi(\cdot)$ is unbounded on $\mathbb{R}^n$. For this choice the optimal estimator $\hat\phi(T)$ is the conditional mean

$$\hat\phi(T) = E^{u(\cdot)}\{x(T) \mid \mathcal{F}_T^{y(\cdot,u(\cdot))}\}, \tag{2.17}$$

where $E^{u(\cdot)}$ denotes expectation with respect to $P^{u(\cdot)}$. Let $\mu(u,t)$ denote the conditional probability measure of $x(t)$, given $\mathcal{F}_t^{y(\cdot,u(\cdot))}$, on $\mathbb{R}^n$. It is convenient to express (2.17) as a vector valued functional of $\mu(u,t)$

$$\hat\phi(T) = \Phi(\mu(u,T)) = \int_{\mathbb{R}^n} x\,d\mu(u,T). \tag{2.18}$$

We shall further assume that the running and switching cost functions $c_i(\cdot)$, $k_{ij}(\cdot)$, $i, j \in \{1, \dots, N\}$, introduced in (1.4) and (1.5) have the following regularity:

$$c_i(\cdot),\ k_{ij}(\cdot) \text{ are in } C_b(\mathbb{R}^n) \text{ (i.e. bounded and continuous).} \tag{2.19}$$

²Since $r(u(t))$ is a singular matrix, this stage is more delicate than in standard stochastic control theory, where $\mathcal{F}_t^z$ would suffice.
As a result of this simple transformation we can rewrite the cost as a function of the impulsive control $u(\cdot)$ only (i.e. the selection of $\hat\phi(\cdot)$ has been eliminated):

$$J(u(\cdot)) = E^{u(\cdot)}\Big\{\|x(T) - \Phi(\mu(u,T))\|^2 + \int_0^T c(x(t),u(t))\,dt + \sum_{i=1}^{\infty} k(x(\tau_i),u(\tau_{i-1}),u(\tau_i))\,\chi_{\tau_i < T}\Big\}, \tag{2.20}$$

where $\chi_{\tau_i < T}$ is the characteristic function of the $\Omega$-set $\{\omega;\ \tau_i(\omega) < T\}$. We further assume that the switching costs are uniformly bounded below

$$k(x,i,j) \ge k_0, \quad x \in \mathbb{R}^n, \quad i, j \in \{1, \dots, N\} \tag{2.21}$$

with $k_0$ a positive constant. Note that as a consequence of (2.20), if for some admissible $u(\cdot)$, with positive probability, the number of times $\tau_i < T$ is infinite, then the cost will be infinite. Therefore for $T$ finite the optimal policy will exhibit a finite number of sensor switchings.
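
The finiteness argument can be made explicit with a short bound. The following is a sketch under the stated assumptions (2.19) and (2.21); the symbols $N_T$ and $\bar c$ are introduced here for illustration only and do not appear in the original text.

```latex
% N_T counts the switchings before T; \bar c bounds the running cost and is finite by (2.19).
N_T := \sum_{i \ge 1} \chi_{\tau_i < T}, \qquad
\bar c := \max_{1 \le j \le N} \sup_{x \in \mathbb{R}^n} |c(x,j)| < \infty .

% The squared estimation error is nonnegative and each switching costs at least k_0 by (2.21):
J(u(\cdot)) \;\ge\; -\,T\,\bar c \;+\; k_0\, E^{u(\cdot)}\{ N_T \} .
```

If $N_T = \infty$ on a set of positive $P^{u(\cdot)}$-probability, the right hand side is infinite, so only schedules with almost surely finitely many switchings before $T$ can have finite cost.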

The optimal sensor selection problem can now be stated precisely as the optimization problem

P: Find an admissible impulsive control $u^*(\cdot)$ such that

$$J(u^*(\cdot)) = \inf_{u(\cdot) \in U_{ad}} J(u(\cdot)), \tag{2.22}$$

where $U_{ad}$ are all impulsive control strategies adapted to $\mathcal{F}_t^{y(\cdot,u(\cdot))}$, or equivalently satisfying (2.15), (2.16). Problem P is a non-standard stochastic control problem of a partially observed diffusion.

3.2.3  The  equivalent  fully  observed  problem 

In this section we transform the problem of section 2.2 to a fully observed stochastic control problem, by introducing appropriate Zakai equations. As is customary in the theory of nonlinear filtering [1], [2], [3], [4], let us introduce the operator

$$p(u(\cdot),t)(\psi) = E\{\zeta(t)\,\psi(x(t)) \mid \mathcal{F}_t^{y(\cdot,u(\cdot))}\} \tag{2.23}$$

for each impulsive control $u(\cdot)$. The notation is chosen so as to emphasize the dependence on $u(\cdot)$, which is due to the dependence of $\zeta(\cdot)$ on $u(\cdot)$ as introduced in eq. (2.8).³ The operator (2.23) maps the set of Borel bounded functions on $\mathbb{R}^n$ into the set of real valued stochastic processes adapted to $\mathcal{F}_t^{y(\cdot,u(\cdot))}$. Note that $p(u(\cdot),t)$ can be viewed as a positive finite measure on $\mathbb{R}^n$. It is the unnormalized conditional probability measure of $x(t)$ given $\mathcal{F}_t^{y(\cdot,u(\cdot))}$ [1], [2].

³But the expectation is with respect to $P$ and not $P^{u(\cdot)}$.


With the help of these measures we can rewrite the various cost terms in (2.20) as follows:

$$E^{u(\cdot)}\{\|x(T) - \Phi(\mu(u,T))\|^2\} = E\{\zeta(T)\,\|x(T) - \Phi(\mu(u,T))\|^2\} = E\{p(u(\cdot),T)(\psi)\}, \tag{2.24}$$

where

$$\psi(x) := \Big\|x - \frac{p(u(\cdot),T)(\chi)}{p(u(\cdot),T)(1)}\Big\|^2 \tag{2.25}$$

with $\chi$ representing the function $\chi(x) := x$ and $1$ the function $1(x) := 1$, $x \in \mathbb{R}^n$. A straightforward computation implies that

$$E^{u(\cdot)}\{\|x(T) - \Phi(\mu(u,T))\|^2\} = E\{\Psi(p(u(\cdot),T))\} \tag{2.26}$$

where $\Psi$ is the functional on finite measures on $\mathbb{R}^n$ defined by

$$\Psi(\mu) = \mu(\chi^2) - \frac{\|\mu(\chi)\|^2}{\mu(1)} \tag{2.27}$$

where $\chi^2(x) = \|x\|^2$, $x \in \mathbb{R}^n$, and $\mu$ is any finite measure on $\mathbb{R}^n$ such that the quantities $\mu(\chi^2)$ and $\mu(\chi)$ make sense.
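
The "straightforward computation" amounts to expanding the square in (2.25) and integrating against the measure. A sketch of that step, assuming a finite measure $\mu$ with $\mu(1) > 0$ and writing $m$ for its normalized mean (a symbol introduced here only for this illustration):

```latex
% m is the normalized mean of the finite measure \mu; it is a constant vector under \mu.
m := \frac{\mu(\chi)}{\mu(1)}, \qquad
\mu\big(\|x - m\|^2\big)
  = \mu(\chi^2) - 2\, m^T \mu(\chi) + \|m\|^2\, \mu(1)
  = \mu(\chi^2) - \frac{2\,\|\mu(\chi)\|^2}{\mu(1)} + \frac{\|\mu(\chi)\|^2}{\mu(1)}
  = \mu(\chi^2) - \frac{\|\mu(\chi)\|^2}{\mu(1)} .
```

Taking $\mu = p(u(\cdot),T)$ turns the right hand side of (2.24) into the right hand side of (2.26).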

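Next

$$\begin{aligned} E^{u(\cdot)}\Big\{\int_0^T c(x(t),u(t))\,dt\Big\} &= E\Big\{\zeta(T)\int_0^T c(x(t),u(t))\,dt\Big\} \\ &= E\Big\{\int_0^T E\{\zeta(T)\,c(x(t),u(t)) \mid \mathcal{F}_t\}\,dt\Big\} \\ &= E\Big\{\int_0^T E\{\zeta(T) \mid \mathcal{F}_t\}\,c(x(t),u(t))\,dt\Big\} \\ &= E\Big\{\int_0^T \zeta(t)\,c(x(t),u(t))\,dt\Big\} \end{aligned} \tag{2.28}$$

because $x(t)$, $u(t)$ are measurable with respect to $\mathcal{F}_t$ and $\zeta(\cdot)$ is an $\mathcal{F}_t$-martingale. Now define a map $C$ with values in $C_b(\mathbb{R}^n)$ via

$$C(u_j) := c_{u_j}(\cdot), \quad u_j \in \{1, 2, \dots, N\}. \tag{2.29}$$

Then in view of (2.29), (2.23), we can rewrite (2.28) as

$$E^{u(\cdot)}\Big\{\int_0^T c(x(t),u(t))\,dt\Big\} = E\Big\{\int_0^T p(u(\cdot),t)(C(u(t)))\,dt\Big\}. \tag{2.30}$$

Finally

$$\begin{aligned} E^{u(\cdot)}\{k(x(\tau_i),u(\tau_{i-1}),u(\tau_i))\,\chi_{\tau_i < T}\} &= E\{\zeta(\tau_i)\,k(x(\tau_i),u(\tau_{i-1}),u(\tau_i))\,\chi_{\tau_i < T}\} \\ &= E\{E\{\zeta(\tau_i)\,k(x(\tau_i),u(\tau_{i-1}),u(\tau_i)) \mid \mathcal{F}_{\tau_i}^{y(\cdot,u(\cdot))}\}\,\chi_{\tau_i < T}\} \\ &= E\{p(u(\cdot),\tau_i)(K(u(\tau_{i-1}),u(\tau_i)))\,\chi_{\tau_i < T}\}. \end{aligned} \tag{2.31}$$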
Here we have introduced the function $K$ with values in $C_b(\mathbb{R}^n)$, via

$$K(i,j) = k_{ij}(\cdot), \quad i, j \in \{1, 2, \dots, N\}; \tag{2.32}$$

and we utilized the admissibility of $u(\cdot)$. Note that in the simpler case where $c_i(\cdot)$, $k_{ij}(\cdot)$, $i, j \in \{1, 2, \dots, N\}$ are constant, independent of $x$, (2.30) simplifies to

$$E^{u(\cdot)}\Big\{\int_0^T c(x(t),u(t))\,dt\Big\} = E\Big\{\int_0^T p(u(\cdot),t)(1)\,c_{u(t)}\,dt\Big\} \tag{2.33}$$

and (2.31) simplifies to

$$E^{u(\cdot)}\{k(x(\tau_i),u(\tau_{i-1}),u(\tau_i))\,\chi_{\tau_i < T}\} = E\{k_{u_{i-1} u_i}\,\chi_{\tau_i < T}\,p(u(\cdot),\tau_i)(1)\}. \tag{2.34}$$

Utilizing (2.26), (2.30), (2.31) we can rewrite the cost corresponding to policy $u(\cdot)$, given in (2.20), as follows

$$J(u(\cdot)) = E\Big\{\Psi(p(u(\cdot),T)) + \int_0^T p(u(\cdot),t)(C(u(t)))\,dt + \sum_{i=1}^{\infty} p(u(\cdot),\tau_i)(K(u_{i-1},u_i))\,\chi_{\tau_i < T}\Big\}. \tag{2.35}$$

In (2.35) we have succeeded in displaying the cost as a functional of the unnormalized conditional measure $p(u(\cdot),\cdot)$, which is the "information" state of the equivalent fully observed stochastic control problem. To complete this transformation we need to derive the evolution equation for $p(u(\cdot),\cdot)$, i.e. the Zakai equation. We turn to this problem next and derive a weak form of the Zakai equation for $p(u(\cdot),\cdot)$ in the following lemma. Here $C_b^{2,1}$ denotes the space of all functions $\psi(x,t)$ on $\mathbb{R}^n \times \mathbb{R}$ which are bounded and continuous together with their first and second derivatives with respect to $x$, and first derivatives with respect to $t$.

Lemma 2.1: For any $\psi \in C_b^{2,1}$ we have the relation

$$p(u(\cdot),t)(\psi(t)) = \pi_0(\psi(0)) + \int_0^t p(u(\cdot),s)\Big(\frac{\partial\psi}{\partial s} + L\psi\Big)\,ds + \int_0^t \sum_{i=1}^{D} p(u(\cdot),s)(H_i(u(s))\psi(s))\,dz_i(s) \tag{2.36}$$

where

$$[H_i(u(s))\phi](x) := \tilde h_i(x,u(s))\,\phi(x), \quad i = 1, 2, \dots, D, \quad \phi \in C_b; \qquad \psi(s)(x) := \psi(x,s), \tag{2.37}$$

and $\tilde h_i$ is the $i$th component of $\tilde h$ (see (2.6)).

Proof:

Let $\beta(\cdot) \in L^\infty(0,T;\mathbb{R}^D)$ be given and consider the $\mathcal{F}_t$-martingale $\rho(t)$, defined by

$$d\rho(t) = \rho(t)\,\beta(t)^T\,dz(t), \quad \rho(0) = 1. \tag{2.38}$$
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Recall  that  by  definition  of  ((f)  (c.f.  eq.  (2.8)) 

=  f(0M*(<),u(f))rd*(f),  C(0)  =  1.  (2.39) 

Therefore  by  Ito's  rule  |8] 

=  f(0p(0l(Mx(f),«(<))+/?(<))Tdr(0 
f(0)/>(0)  =  1,  (2.40) 

and  since  *l>  €  C*'1 

«W*(0.0  =  +ty(*(t)  ,t))di 

+  |W(*(0.0]Tf(*(0)<M0.  (2.41) 

where  L  is  given  in  (2.1).  Therefore  suppressing  some  arguments  for  ease  of  notation 

«W(*(*).0  f(*M01  =  f(«M0l(§£  +  ^  +  *rW« 

+  Vll>Tsdw(t)  +  i/>(h  +  0)Tdz(t)}.  (2.42) 

In  (2.41),  (2.42)  we  used  the  notation  =  (££.•••  »££)T-  Integrating  (2.42),  and  taking 
expectations  we  deduce 

W(x(<),0f(0/>(0}  =  *o(0(O))  +  E{Jo  f (5)^(5)^  +  Lxl>  +  h.T0xl>\ds}.  (2.43) 

We  can  then  write 

Ei£  S(s)p[s)\^  +  L4,}ds}  =  E{J‘E{p(s)((S)(^+L^)\r^}d^ 

=  E{fo  />(<®)p(«(*)» *)(^j  +  Lil>)ds) 

=  E{p{t)  f^p[u[  ),s)[j^  +  Lil>)ds}  (2.44) 

by  virtue  of  the  7(-martingale  property  of  p[-).  Similarly 

Eifo  C('')/,(s)i(*(s):U(s))T0(s)V'(z(*),*)ds} 

=  £{p(l)  Jo  i{s)tl>{z[s),s)h{x{s),u{s))Tdz{s)} 

=  £{p[t)  I  2T #»(«().  **W)tf(-.*))<Ma)},  (2.45) 

0  •*! 

where  in  the  first  equality  we  have  used  the  representation  p(t)  =  1  +  /„'  p[»)0(t)Tdz{t),  and 
the  well  known  isomorphism  between  Ito  stochastic  integrals  and  L*  (8j.  Finally 

£{*(*(<).<)f(<M<)}  =  F{p(l)p(«(),t)(i(t))}.  (2.46) 
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Using  (2.44),  (2.45),  (2.46)  in  (2.43)  we  obtain 

£{p(0[p(“(-). 0M0)  -  *o$(0))  -  p(«(-).«)(^j  +  Ltyda 

~  >(*(-).«)(ft(«W)^W)^-W])  =  O.  (2.47) 

J0  |'sl 

We  can  replace  in  (2.47)  p(t)  by  a  linear  combination  of  such  variables,  with  different  {J. 
The  set  of  corresponding  variables  is  dense  in  L7(Cl,7tM,P).  However,  the  random  variable  in 
the  brackets  in  the  right  hand  side  of  (2.47)  is  clearly  in  L7[{\t7^  M^\P)  and  therefore  in 
L*(n,/,*,P)  since  c  7‘.  Then  (2.47)  implies  the  result  of  the  lemma  (2.36). 

As a remark we would like to note that the assumed nondegeneracy of $z(\cdot)$ implies that the solution of (2.36) is unique. This can be proved in general under our working hypotheses, for solutions which are measure-valued processes. Here we outline such a proof for the case when these conditional measures are absolutely continuous with respect to Lebesgue measure on $\mathbb{R}^n$; i.e., in the case where unnormalized conditional densities exist. For this we need to assume in addition that

$\pi_0$ has a density $p_0$ with respect to Lebesgue measure; $p_0 \in L^2(\mathbb{R}^n)$. (2.48)

Let us denote by $L^*$ the formal adjoint of $L$ (see (2.1), (2.1a), (2.1b)):


n  o  o  "  ^ 

V  -  Y  —a,, lx)-—  +  ir  —a,, 
(2.49)

and consider the Hilbert space form of the Zakai equation [10]

$dp = L^*p\,dt + p\,h(\cdot,u(t))^T dz(t)$
$p(0) = p_0$. (2.50)

The function space in which the solution is sought is

$L^2(\Omega,\mathcal{A},P;C(0,T;L^2(\mathbb{R}^n))) \cap L^2_{\mathcal{F}}(0,T;H^1(\mathbb{R}^n))$. (2.51)

Here $H^1$ is the usual Sobolev space on $\mathbb{R}^n$ [11] and the subindex $\mathcal{F}$ in the second $L^2$ space denotes that the solution is adapted to the filtration $\mathcal{F}^t$, $t \ge 0$. It follows from the results of E. Pardoux [11] that there exists a unique solution of (2.50) in the function space (2.51), under the assumptions made here. We can then establish the following.
Lemma 2.2: The following property holds

$p(u(\cdot),t)(\psi) = (p(u(\cdot),t),\psi), \quad \forall\psi$ (2.52)

in $L^2(\mathbb{R}^n)$ and bounded, where $(\cdot,\cdot)$ denotes inner product in $L^2(\mathbb{R}^n)$.

Proof:

By slight abuse of notation we use the same symbol to denote the conditional unnormalized measure and density (whenever the latter exists). Let us prove inductively that

$p(u(\cdot),\tau_i \vee (t \wedge \tau_{i+1}))(\psi) = (p(u(\cdot),\tau_i \vee (t \wedge \tau_{i+1})),\psi)$, (2.53)

where the left hand side notation refers to the measure appearing in (2.36), while the right hand side notation refers to the solution of (2.50), which is uniquely defined. Suppose then that (2.53) holds for $i-1$, and therefore in particular

$p(u(\cdot),\tau_i)(\psi) = (p(u(\cdot),\tau_i),\psi), \quad \forall\psi$. (2.54)

Consider now the solution $\eta$ of

$\frac{\partial\eta}{\partial s} + L\eta = -\eta\,h(\cdot,u(s))^T\beta(s), \quad s \in (\tau_i,\tau_i \vee (t \wedge \tau_{i+1}))$

$\eta(x,\tau_i \vee (t \wedge \tau_{i+1})) = \psi(x)$ (2.55)

where $\psi \in C_0^\infty(\mathbb{R}^n)$ and $\beta$ is a smooth deterministic function with values in $\mathbb{R}^D$. From the assumptions on $f$, $g$ and $h^i$ (it is here that we use the assumed Hölder continuity of $h^i$), we can assert that the solution of (2.55) belongs to $C_b^{2,1}(\mathbb{R}^n \times (\tau_i,\tau_i \vee (t \wedge \tau_{i+1})))$, for any sample $\omega$ [11]. Therefore (2.36) implies (using (2.55))

$p(u(\cdot),\tau_i \vee (t \wedge \tau_{i+1}))(\psi) = p(u(\cdot),\tau_i)(\eta(\tau_i))$

$- \int_{\tau_i}^{\tau_i \vee (t \wedge \tau_{i+1})} \sum_{j=1}^{D} p(u(\cdot),s)(H_j(u(s))\eta(s))\beta_j(s)\,ds$

$+ \int_{\tau_i}^{\tau_i \vee (t \wedge \tau_{i+1})} \sum_{j=1}^{D} p(u(\cdot),s)(H_j(u(s))\eta(s))\,dz_j(s)$, (2.56)

where $H_j$ is as defined in Lemma 2.1, and $\eta(s)(x) := \eta(x,s)$. Therefore by Ito's rule

$p(u(\cdot),\tau_i \vee (t \wedge \tau_{i+1}))(\psi)\,\rho(\tau_i \vee (t \wedge \tau_{i+1})) = p(u(\cdot),\tau_i)(\eta(\tau_i))\rho(\tau_i)$

$+ \int_{\tau_i}^{\tau_i \vee (t \wedge \tau_{i+1})} \rho(s)\sum_{j=1}^{D} p(u(\cdot),s)(H_j(u(s))\eta(s))\,dz_j(s)$

$+ \int_{\tau_i}^{\tau_i \vee (t \wedge \tau_{i+1})} \rho(s)\sum_{j=1}^{D} p(u(\cdot),s)(\eta(s))\beta_j(s)\,dz_j(s)$. (2.57)

Hence

$E\{p(u(\cdot),\tau_i \vee (t \wedge \tau_{i+1}))(\psi)\,\rho(\tau_i \vee (t \wedge \tau_{i+1}))\} = E\{p(u(\cdot),\tau_i)(\eta(\tau_i))\rho(\tau_i)\}$. (2.58)

On the other hand from (2.50) and (2.55) we obtain

$(p(u(\cdot),\tau_i \vee (t \wedge \tau_{i+1})),\psi) = (p(u(\cdot),\tau_i),\eta(\tau_i))$

$+ \int_{\tau_i}^{\tau_i \vee (t \wedge \tau_{i+1})} \sum_{j=1}^{D} (p(u(\cdot),s),h_j(\cdot,u(s))\eta(s))\,dz_j(s)$

$- \int_{\tau_i}^{\tau_i \vee (t \wedge \tau_{i+1})} \sum_{j=1}^{D} (p(u(\cdot),s),H_j(u(s))\eta(s))\beta_j(s)\,ds$, (2.59)

and thus also

$E\{(p(u(\cdot),\tau_i \vee (t \wedge \tau_{i+1})),\psi)\,\rho(\tau_i \vee (t \wedge \tau_{i+1}))\} = E\{(p(u(\cdot),\tau_i),\eta(\tau_i))\rho(\tau_i)\}$. (2.60)

But from the inductive hypothesis (2.54), the right hand sides of (2.58) and (2.60) are equal. Hence the left hand sides coincide. Varying $\beta$, we easily deduce that (2.53) holds, at least for $\psi \in C_0^\infty(\mathbb{R}^n)$, which is sufficient to conclude the proof of the lemma.

With this result we can rewrite the cost (2.35) as follows

$J(u(\cdot)) = E\{\Psi(p(u(\cdot),T)) + \int_0^T (p(u(\cdot),t),C(u(t)))\,dt + \sum_{i=1}^{\infty} \chi_{\tau_i<T}(p(u(\cdot),\tau_i),K(u_{i-1},u_i))\}$ (2.61)

where (see (2.27))

$\Psi(p(u(\cdot),T)) = (p(u(\cdot),T),\chi^2) - \frac{\|(p(u(\cdot),T),\chi)\|^2}{(p(u(\cdot),T),1)}$. (2.62)

Since the expression (2.62) involves unbounded functions we have to show that it makes sense.

At this point it is useful to introduce a weighted Hilbert space in order to express $\Psi(p(u(\cdot),T))$ in a more convenient form. To this end let

$\mu(x) = 1 + \|x\|^4$ (2.63)

and let $L^2(\mathbb{R}^n;\mu)$ denote the space of functions $\varphi$ such that $\varphi\mu \in L^2(\mathbb{R}^n)$. Define in a similar way the space $L^1(\mathbb{R}^n;\mu)$. From the discussion of existence and uniqueness of solutions of (2.50) in the functional space (2.51), and if

$p_0 \in L^2(\mathbb{R}^n;\mu) \cap L^1(\mathbb{R}^n;\mu)$,

it is easy to check that (2.50), under the assumptions made in section 2.1, has a unique solution in the space

$L^2(\Omega,\mathcal{A},P;C(0,T;L^2(\mathbb{R}^n;\mu) \cap L^1(\mathbb{R}^n;\mu))) \cap L^2(0,T;H^1(\mathbb{R}^n;\mu))$ (2.64)

where $H^1(\mathbb{R}^n;\mu)$ is the obvious modification of $H^1(\mathbb{R}^n)$. This justifies that the quantities arising in (2.62) have a meaning.

We note that $J(u(\cdot))$ is indexed implicitly (we do not include this in our notation) by $\pi_0$ (or $p_0$) and $u(0) = j$, $j \in \{1,\dots,N\}$, which is deterministic since it is $\mathcal{F}^0$-measurable, by construction.

We close this section by rewriting the dynamics (2.50) in terms of the originally given observation nonlinearities $h^i$, and with forcing inputs the processes $y^i(\cdot)$ introduced in (2.13), (2.14). In view of (2.5), (2.6), (2.7), (2.13), (2.14) we have

i(-,«(i))r*(o  <  <  <  6.. 
(where  we  have  written  t  —  . . .  **/)T) 

=  r,<t<  r,+I 

>=J 

=  $(-,i',)Tdy(<,t'.),  r,  <  t  <  r,+i 

=:  f(-,«(<))r4y(f.«(0). 

where 

=  firlA‘(I)X(w(«)  (2  65) 

RMlhM(z)XM{M) 

Therefore the system dynamics (2.50) can be written equivalently

$dp(u(\cdot),t) = L^*p(u(\cdot),t)\,dt + p(u(\cdot),t)\,\tilde h(\cdot,u(t))^T dy(t,u(t))$

$p(u(\cdot),0) = p_0$, (2.66)

where $y(t,u(t))$ is defined in (2.13), (2.14). This makes precise the construction of a Zakai equation driven by "controlled" observations alluded to in the introduction. It also becomes clear now that the spaces described by (2.51), (2.64) are the appropriate ones as far as solutions of (2.50) or (2.66) are concerned.
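
As a rough numerical illustration of a Zakai-type equation driven by controlled observations, the following sketch replaces the diffusion by a finite-state Markov chain, so that the transpose of its generator Q plays the role of $L^*$, and advances the unnormalized conditional measure by one prediction/correction step under whichever sensor is active. The function names, the two-state chain, the table h_by_sensor and the splitting scheme are all assumptions made for illustration; they are not the construction used in this report.

```python
import math

def zakai_step(p, Q, h, dy, dt):
    """One splitting step for a finite-state analog of dp = L* p dt + p h dy.

    p  : unnormalized conditional measure (list of nonnegative weights)
    Q  : generator of the chain, rows summing to zero (assumed stand-in for L)
    h  : observation function of the active sensor, one value per state
    dy : scalar observation increment over a step of length dt
    """
    n = len(p)
    # Prediction: explicit Euler step of dp = Q^T p dt.
    pred = [p[j] + dt * sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]
    # Correction: multiply by the exponential likelihood of the increment.
    return [pred[j] * math.exp(h[j] * dy - 0.5 * h[j] ** 2 * dt) for j in range(n)]

def run_schedule(p0, Q, h_by_sensor, schedule, increments, dt):
    """Propagate p under a given sensor schedule (one sensor index per step)."""
    p = list(p0)
    for sensor, dy in zip(schedule, increments):
        p = zakai_step(p, Q, h_by_sensor[sensor], dy, dt)
    return p

def normalize(p):
    """Normalized conditional probabilities obtained from the unnormalized p."""
    total = sum(p)
    return [x / total for x in p]

# Hypothetical data: sensor 1 distinguishes the two states, sensor 2 does not.
Q = [[-1.0, 1.0], [1.0, -1.0]]
h_by_sensor = {1: [0.0, 1.0], 2: [0.0, 0.0]}
p = run_schedule([0.5, 0.5], Q, h_by_sensor, [1, 1, 2], [0.5, 0.5, 0.5], 0.01)
posterior = normalize(p)
```

Only the steps taken with the informative sensor change the normalized posterior through the correction factor; under the uninformative sensor the step reduces to the prediction, which preserves total mass.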

3.3  The  Solution  of  the  Optimization  Problem 

3.3.1  Setting  up  a  system  of  quasi-variational  inequalities 

Let us consider the Banach space $H = L^2(\mathbb{R}^n;\mu) \cap L^1(\mathbb{R}^n;\mu)$ and the metric space $H^+$ of positive elements of $H$. Let

$B$ := space of Borel measurable, bounded functions on $H^+$

$C$ := space of uniformly continuous, bounded functions on $H^+$. (3.1)

Let us now define semigroups $\Phi_j(t)$ on $B$ or $C$ as follows. Consider (2.50) with fixed schedule $u(t) = j$, and let $p_j$ denote the corresponding density $p(\cdot,j)$. Then for $j \in \{1,2,\dots,N\}$

$dp_j = L^*p_j\,dt + p_j h^{jT} dz(t), \quad p_j(0) = \pi$, (3.2)

where

$h^j := h(\cdot,j)$. (3.3)

We set

$\Phi_j(t)(F)(\pi) = E\{F(p_{j,\pi}(t))\}, \quad F \in B$ or $C$, (3.4)

where $p_{j,\pi}$ indicates the solution of (3.2) with initial value $\pi$. It is easy to see that $\Phi_j(t)$ is a semigroup since $p_j(t)$ is a Markov process with values in $H^+$. It is also useful to introduce the subspaces $B_1$ and $C_1$ of functions such that

$\|F\|_1 = \sup_{\pi \in H^+} \frac{|F(\pi)|}{1 + \|\pi\|_\mu} < \infty$ (3.5)

where $\|\pi\|_\mu = \int \pi(x)\mu(x)\,dx$. The spaces $B_1$ and $C_1$ are also Banach spaces. They are needed, because we shall encounter functionals with linear growth in the cost function (2.61). To simplify the statement and analysis of the quasi-variational inequalities that solve the optimization problem considered here, we give the details for the case $N = 2$ only in the sequel. We shall insert remarks to indicate how the results should be modified for the general case. Let us introduce the notation

$C_i := C(i), \quad i = 1,2$,

$K_1 := K(1,2)$

$K_2 := K(2,1)$. (3.6)

Since $C_1, C_2, K_1, K_2$ are bounded functions, one can utilize them to define elements of $C_1$ via (for example)

$C_1(\pi) = (C_1,\pi)$ (3.7)

where a slight abuse of notation, in denoting the functional and the function by the same symbol, has been allowed. Similarly the functional on $H^+$

$\Psi(\pi) = (\pi,\chi^2) - \frac{\|(\pi,\chi)\|^2}{(\pi,1)}$ (3.8)

belongs to $C_1$ since it is positive and

$\Psi(\pi) \le (\pi,\chi^2) \le \|\pi\|_\mu$.

Consider now the set of functionals $U_1(\pi,t)$, $U_2(\pi,t)$ such that

$U_1, U_2 \in C(0,T;C_1)$

$U_1(\cdot,t) \ge 0, \quad U_2(\cdot,t) \ge 0$

$U_1(\pi,T) = U_2(\pi,T) = \Psi(\pi)$

$U_1(\pi,t) \le \Phi_1(s-t)U_1(\pi,s) + \int_t^s \Phi_1(\lambda-t)C_1(\pi)\,d\lambda$

$U_2(\pi,t) \le \Phi_2(s-t)U_2(\pi,s) + \int_t^s \Phi_2(\lambda-t)C_2(\pi)\,d\lambda$

$\forall s \ge t$

$U_1(\pi,t) \le K_1(\pi) + U_2(\pi,t)$

$U_2(\pi,t) \le K_2(\pi) + U_1(\pi,t)$.

In the sequel we will occasionally use the notation $U_i(s)(\pi) = U_i(\pi,s)$, $i = 1,2$.


(3.10)

3.3.2 Existence of a maximum element

We shall refer to (3.10) as the system of quasi-variational inequalities (QVI). Our first objective is to prove the following.

Theorem 3.1. We assume that the conditions on the data $f, g, h^i$ introduced in section 2.1 hold. Then the set of functionals $U_1, U_2$ satisfying (3.10) is non-empty and has a maximum element, in the sense that if $\bar U_1, \bar U_2$ denotes this maximum element and $U_1, U_2$ satisfies (3.10), then

$\bar U_1 \ge U_1, \quad \bar U_2 \ge U_2$.

The proof will be carried out in several steps. In fact there is some difficulty due to the functional $\Psi(\pi)$. We shall modify it in order to assume that

$0 \le \Psi(\pi) \le \bar\Psi(\pi,1)$ (3.11)

where $\bar\Psi$ is a constant. We shall first prove the theorem with the additional assumption (3.11), and then prove the probabilistic interpretation, i.e. the connection with the infimum of (2.61). The probabilistic formula will next be used in an approximation procedure. We can approximate for instance the functional $\Psi$ defined by (3.8) in the following way. Set


11/ 


dx\ 


/  irdx 


(3.12) 


which clearly satisfies (3.11) with $\bar\Psi = n$.

Proof of Theorem 3.1 under the assumption (3.11). The set of functionals satisfying (3.10) is a subset of $B_1$ or $C_1$ defined in (3.5). However for this subset the norm (3.5) is unnecessarily restrictive. For those functionals it is sufficient to set

$\tilde H = L^2(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$

$\tilde H^+$ = set of positive elements of $\tilde H$ (3.13)

and to consider the space $\tilde B_1$ or $\tilde C_1$ of Borel or continuous functionals on $\tilde H^+$ such that

$\|F\|_1 = \sup_{\pi \in \tilde H^+} \frac{|F(\pi)|}{1 + (\pi,1)} < \infty$. (3.14)

We shall then study the system (3.10) with $C_1$ replaced by $\tilde C_1$. Let us notice that

$H^+ \subset \tilde H^+$

and if one considers a functional $F$ in $\tilde B_1$ or $\tilde C_1$, its restriction to $H^+$ belongs to $B_1$ or $C_1$; the injection

$F \to$ restriction of $F$ to $H^+$

is continuous from $\tilde B_1$ or $\tilde C_1$ to $B_1$ or $C_1$. Therefore replacing in (3.10) $C_1$ by $\tilde C_1$ gives a stronger result.

We shall in the proof omit the symbol ~ and write $B_1, C_1$ instead of $\tilde B_1, \tilde C_1$, and $H^+$ instead of $\tilde H^+$; the norm $\|\cdot\|_1$ is then given by (3.14).

The proof is then an adaptation of the methods of Bensoussan-Lions [9] to the present case, in order to take into account the fact that we use $C_1$ instead of $C$.

First note that

$\|\Phi_j(t)\|_{\mathcal{L}(C_1;C_1)} \le 1$ (3.15)

where $\mathcal{L}(C_1;C_1)$ is the space of linear continuous operators from $C_1$ into itself. Indeed we have

$\frac{|\Phi_j(t)(F)(\pi)|}{1 + (\pi,1)} = \frac{|E\{F(p_{j,\pi}(t))\}|}{1 + (\pi,1)} \le \|F\|_1 \frac{1 + E(p_{j,\pi}(t),1)}{1 + (\pi,1)}$ (3.16)

since from (3.2)

$E(p_{j,\pi}(t),1) = (\pi,1)$. (3.17)

Therefore

$\|\Phi_j(t)(F)\|_1 \le \|F\|_1$ (3.18)

which implies (3.15).

Note also that a solution of (3.10) will satisfy

$U_1(\pi,t) \le \Phi_1(T-t)U_1(\pi,T) + \int_t^T \Phi_1(\lambda-t)C_1(\pi)\,d\lambda$

and due to positivity, we also have

$\|U_1(t)\|_1 \le \|U_1(T)\|_1 + \|C_1\|(T-t) \le \bar\Psi + \|C_1\|(T-t)$ (3.19)

where $\|C_1\| = \sup_x C_1(x)$.
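
The mass-conservation identity $E(p_{j,\pi}(t),1) = (\pi,1)$ invoked in the proof of (3.15) can be checked formally from (3.2) by pairing the equation with the constant function 1. The sketch below assumes that $L$ has no zeroth-order term, so that $L1 = 0$, and that the stochastic integral is a true martingale; both are assumptions of this sketch rather than statements taken from the report.

```latex
\begin{aligned}
(p_{j,\pi}(t),1)
  &= (\pi,1) + \int_0^t (L^* p_{j,\pi}(s),1)\,ds
     + \int_0^t (p_{j,\pi}(s),h^{j})^T\,dz(s) \\
  &= (\pi,1) + \int_0^t (p_{j,\pi}(s),L1)\,ds
     + \int_0^t (p_{j,\pi}(s),h^{j})^T\,dz(s) \\
  &= (\pi,1) + \int_0^t (p_{j,\pi}(s),h^{j})^T\,dz(s), \\
E\,(p_{j,\pi}(t),1) &= (\pi,1).
\end{aligned}
```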

As is customary in the study of QVI we begin with the corresponding obstacle problem,

$U_1, U_2 \in C(0,T;C_1)$

$U_1(\cdot,t) \ge 0, \quad U_2(\cdot,t) \ge 0$

$U_1(\pi,T) = U_2(\pi,T) = \Psi(\pi)$

$U_1(\pi,t) \le \Phi_1(s-t)U_1(\pi,s) + \int_t^s \Phi_1(\lambda-t)C_1(\pi)\,d\lambda$

$U_2(\pi,t) \le \Phi_2(s-t)U_2(\pi,s) + \int_t^s \Phi_2(\lambda-t)C_2(\pi)\,d\lambda$

$\forall s \ge t$

$U_1(\pi,t) \le K_1(\pi) + \zeta_2(\pi,t)$

$U_2(\pi,t) \le K_2(\pi) + \zeta_1(\pi,t)$ (3.20)

where we assume that

$\zeta_1, \zeta_2 \in C(0,T;C_1)$

$\zeta_1(\pi,t) \ge 0, \quad \zeta_2(\pi,t) \ge 0$

$\zeta_1(\pi,T), \zeta_2(\pi,T) \ge \Psi(\pi)$. (3.21)

We then have the following.

Proposition 3.1: For $\zeta_1, \zeta_2$ as in (3.21) the set of $U_1, U_2$ satisfying (3.20) is not empty and has a maximum element.

It is clear that for $\zeta_1, \zeta_2$ given, the system of inequalities (3.20) can be decoupled and $U_1, U_2$ can be considered separately. Let us then omit indices momentarily and consider

$U \in C(0,T;C_1)$

$U(\cdot,t) \ge 0$

$U(\pi,T) = \Psi(\pi)$

$U(\pi,t) \le \Phi(s-t)U(\pi,s) + \int_t^s \Phi(\lambda-t)C(\pi)\,d\lambda$

$\forall s \ge t$

$U(\pi,t) \le \zeta(t)$ (3.22)

where $\zeta$ stands, for instance, for $K_1(\pi) + \zeta_2(\pi,t)$. To prove Proposition 3.1, it suffices to show that (3.22) has a maximum element. This can be done by the penalty method. So we look for $U_\varepsilon$ solving

$U_\varepsilon(t) = \Phi(s-t)U_\varepsilon(s) + \int_t^s \Phi(\lambda-t)[C(\pi) - \frac{1}{\varepsilon}(U_\varepsilon(\lambda) - \zeta(\lambda))^+]\,d\lambda$

for $t \le s \le T$

$U_\varepsilon(T)(\pi) = \Psi(\pi)$

$U_\varepsilon \in C(0,T;C_1)$

$U_\varepsilon(\cdot,t) \ge 0$. (3.23)
We can then assert

Lemma 3.1. There is a unique solution of (3.23).

Proof: Notice that (3.23) is equivalent to

$U_\varepsilon(t) = \Phi(T-t)U_\varepsilon(T) + \int_t^T \Phi(\lambda-t)[C(\pi) - \frac{1}{\varepsilon}(U_\varepsilon(\lambda) - \zeta(\lambda))^+]\,d\lambda$ (3.24)

and also to

$U_\varepsilon(t) = e^{-\frac{1}{\varepsilon}(T-t)}\Phi(T-t)\Psi(\pi) + \int_t^T e^{-\frac{1}{\varepsilon}(\lambda-t)}\Phi(\lambda-t)[C(\pi) + \frac{1}{\varepsilon}U_\varepsilon(\lambda) - \frac{1}{\varepsilon}(U_\varepsilon(\lambda) - \zeta(\lambda))^+]\,d\lambda$. (3.25)

Let us define the transformation $T_\varepsilon$ of $C(0,T;C_1)$ into itself using the right hand side of (3.25). Then the latter can be written as a fixed point equation

$U_\varepsilon = T_\varepsilon U_\varepsilon$. (3.26)

Using (3.11) and (3.15) one can show precisely as in Bensoussan-Lions [9, p. 488] that some power of $T_\varepsilon$ is a contraction. Hence the result of the lemma follows.

One then can also prove as in [9, pp. 489-490] that if $\varepsilon \le \varepsilon'$, $\|U_\varepsilon\|_1 \le K$, then $0 \le U_\varepsilon \le U_{\varepsilon'}$. As in [9, pp. 494-495] one then shows that as $\varepsilon \downarrow 0$, $U_\varepsilon \downarrow U$, which is the maximum element of (3.22). The convergence takes place in $C(0,T;C_1)$. This establishes Proposition 3.1.
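
The penalty method can be illustrated on a small discrete analog, in which the semigroup over one time step is replaced by a stochastic matrix P acting on functions of finitely many states. The sketch below is only that: the finite state space, the time-independent obstacle zeta, the implicit treatment of the penalty term and all function names are assumptions made for illustration. It compares the penalized recursion with the direct backward recursion for the maximum element.

```python
def penalized_obstacle(P, C, psi, zeta, dt, steps, eps):
    """Backward recursion for a discrete analog of the penalized problem.

    Each step solves u = a - (dt/eps) * max(u - zeta, 0) in closed form,
    where a = dt*C + P u_next is the unconstrained continuation value.
    """
    n = len(psi)
    r = dt / eps
    U = list(psi)
    for _ in range(steps):
        a = [dt * C[i] + sum(P[i][j] * U[j] for j in range(n)) for i in range(n)]
        U = [a[i] if a[i] <= zeta[i] else (a[i] + r * zeta[i]) / (1.0 + r)
             for i in range(n)]
    return U

def obstacle_max_element(P, C, psi, zeta, dt, steps):
    """Largest discrete solution below the obstacle: u = min(zeta, dt*C + P u_next)."""
    n = len(psi)
    U = list(psi)
    for _ in range(steps):
        U = [min(zeta[i], dt * C[i] + sum(P[i][j] * U[j] for j in range(n)))
             for i in range(n)]
    return U

# Hypothetical two-state data; the terminal value psi lies below the obstacle.
P = [[0.9, 0.1], [0.2, 0.8]]
C = [1.0, 2.0]
psi = [0.0, 0.0]
zeta = [0.5, 0.5]
exact = obstacle_max_element(P, C, psi, zeta, 0.1, 20)
coarse = penalized_obstacle(P, C, psi, zeta, 0.1, 20, 1.0)
fine = penalized_obstacle(P, C, psi, zeta, 0.1, 20, 0.01)
```

In this discrete setting the penalized values decrease as eps decreases and stay above the maximum element, which mirrors the monotone convergence stated for $U_\varepsilon$.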

We can then proceed with the

Proof of Theorem 3.1: (Continuation)

Let us consider the map $H$ mapping $C(0,T;C_1) \times C(0,T;C_1)$ into itself defined by

$H(\zeta_1,\zeta_2) = (U_1,U_2)$ (3.27)

where the right hand side represents the maximum element of (3.20). Let now

$U_1^0(\pi,t) = \Phi_1(T-t)\Psi(\pi) + \int_t^T \Phi_1(\lambda-t)C_1(\pi)\,d\lambda$

$U_2^0(\pi,t) = \Phi_2(T-t)\Psi(\pi) + \int_t^T \Phi_2(\lambda-t)C_2(\pi)\,d\lambda$. (3.28)

Consider $\zeta_i(t)$, $\xi_i(t)$, $i = 1,2$ such that

$0 \le \zeta_i(t) \le \xi_i(t) \le U_i^0(t), \quad i = 1,2$, (3.29)

and

$\xi_i(t) - \zeta_i(t) \le \gamma\xi_i(t), \quad \gamma \in [0,1]$. (3.30)

Then we have

$0 \le H(\xi_1,\xi_2) - H(\zeta_1,\zeta_2) \le \gamma(1-\gamma')H(\xi_1,\xi_2)$ (3.31)

where

$\gamma' \le \frac{k_0}{k_0 + \bar\Psi + \max(\|C_1\|,\|C_2\|)T}$. (3.32)

Indeed, setting

$\kappa = 1 - \gamma(1-\gamma')$ (3.33)

we have to prove that

$\kappa H(\xi_1,\xi_2) \le H(\zeta_1,\zeta_2)$. (3.34)

Let us set

$(U_1,U_2) = H(\zeta_1,\zeta_2)$

$(\tilde U_1,\tilde U_2) = H(\xi_1,\xi_2)$. (3.35)

We need then to show that

$\kappa\tilde U_1 \le U_1, \quad \kappa\tilde U_2 \le U_2$. (3.36)

If we can establish that

$\kappa K_1(\pi) + \kappa\xi_2(\pi,t) \le K_1(\pi) + \zeta_2(\pi,t)$

$\kappa K_2(\pi) + \kappa\xi_1(\pi,t) \le K_2(\pi) + \zeta_1(\pi,t)$, (3.37)

then (3.36) is implied by the monotonicity properties of Variational Inequalities. But

$\xi_i(\pi,t)(1-\gamma) \le \zeta_i(\pi,t)$ (3.38)

hence it is enough to establish that

$\kappa K_1(\pi) + \kappa\xi_2(\pi,t) \le K_1(\pi) + (1-\gamma)\xi_2(\pi,t)$

$\kappa K_2(\pi) + \kappa\xi_1(\pi,t) \le K_2(\pi) + (1-\gamma)\xi_1(\pi,t)$. (3.39)

The first of (3.39) will be satisfied if

$[\kappa - (1-\gamma)]\xi_2(\pi,t) \le (1-\kappa)K_1(\pi)$ (3.40)

or, since $\kappa - (1-\gamma) = \gamma\gamma'$ and $1 - \kappa = \gamma(1-\gamma')$, if

$\gamma'\xi_2(\pi,t) \le (1-\gamma')K_1(\pi)$. (3.41)

But observe that

$\xi_2(\pi,t) \le U_2^0(\pi,t) \le (\bar\Psi + \|C_2\|T)(\pi,1)$.

So it is enough to choose $\gamma'$ so that

$\gamma'(\bar\Psi + \|C_2\|T)(\pi,1) \le (1-\gamma')k_0(\pi,1)$ (3.42)

where $k_0$ is the uniform lower bound (2.21), since $K_1(\pi) \ge k_0(\pi,1)$. This last inequality requires

$\gamma' \le \frac{k_0}{k_0 + \bar\Psi + \|C_2\|T}$. (3.43)

In an identical fashion, the second of (3.39) will be satisfied if

$\gamma' \le \frac{k_0}{k_0 + \bar\Psi + \|C_1\|T}$. (3.44)

So both of (3.39) will be satisfied if we choose $\gamma'$ according to (3.32). The proof of the theorem then proceeds via the standard iteration

$(U_1^{n+1},U_2^{n+1}) = H(U_1^n,U_2^n)$ (3.45)

as in [9, pp. 512-514].
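
To make the iteration (3.45) concrete, the following sketch runs it on a finite-state, discrete-time analog in which each sensor j has its own one-step transition matrix Pj standing in for the semigroup. Everything here (state space, matrices, costs, function names) is a hypothetical stand-in chosen for illustration; the report's functionals live on a space of densities, not on a finite set.

```python
def solve_obstacle(P, C, psi, obstacle, dt):
    """Maximum element of the discrete obstacle problem.

    obstacle[k][i] is the obstacle at time step k and state i; the terminal
    value is psi. Returns U with U[k][i] for k = 0..steps.
    """
    steps = len(obstacle) - 1
    n = len(psi)
    U = [None] * (steps + 1)
    U[steps] = list(psi)
    for k in range(steps - 1, -1, -1):
        cont = [dt * C[i] + sum(P[i][j] * U[k + 1][j] for j in range(n))
                for i in range(n)]
        U[k] = [min(cont[i], obstacle[k][i]) for i in range(n)]
    return U

def qvi_iteration(P1, P2, C1, C2, K1, K2, psi, dt, steps, sweeps):
    """Iterate (U1, U2) <- H(U1, U2), starting from the no-switching values."""
    n = len(psi)
    no_obstacle = [[float("inf")] * n for _ in range(steps + 1)]
    U1 = solve_obstacle(P1, C1, psi, no_obstacle, dt)  # analog of U_1^0
    U2 = solve_obstacle(P2, C2, psi, no_obstacle, dt)  # analog of U_2^0
    for _ in range(sweeps):
        obs1 = [[K1[i] + U2[k][i] for i in range(n)] for k in range(steps + 1)]
        obs2 = [[K2[i] + U1[k][i] for i in range(n)] for k in range(steps + 1)]
        U1, U2 = (solve_obstacle(P1, C1, psi, obs1, dt),
                  solve_obstacle(P2, C2, psi, obs2, dt))
    return U1, U2

# Hypothetical data: sensor 1 is expensive to run, sensor 2 is cheap.
P1 = [[0.9, 0.1], [0.1, 0.9]]
P2 = [[0.5, 0.5], [0.5, 0.5]]
U1, U2 = qvi_iteration(P1, P2, [1.0, 1.0], [0.1, 0.1],
                       [0.05, 0.05], [0.05, 0.05], [0.0, 0.0], 0.1, 10, 5)
```

With these numbers, starting with the expensive sensor is worth roughly one switching cost plus the cost of running the cheap sensor to the horizon, and the limit satisfies the coupling inequality $U_1 \le K_1 + U_2$ of the discrete system.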

Remark. The extension of this result to the general case $N \ne 2$ is straightforward. The system (3.10) has $N$ functionals $U_1,\dots,U_N$. Everything in (3.10) is the same except for the last two inequalities which are replaced by

$U_i(\pi,t) \le \min_{j \ne i,\ j=1,\dots,N} (K_{ij}(\pi) + U_j(\pi,t)), \quad i = 1,\dots,N$. (3.46)

One again introduces the system (3.20) where the last two inequalities are replaced by

$U_i(\pi,t) \le \min_{j \ne i,\ j=1,\dots,N} (K_{ij}(\pi) + \zeta_j(\pi,t)), \quad i = 1,\dots,N$ (3.47)

where $\zeta_i \in C(0,T;C_1)$, and satisfy the remainder of (3.21). One then establishes the analog of Proposition 3.1 by penalization. The analog of Theorem 3.1 is established by introducing a map $H$ mapping $C(0,T;C_1)^N$ into itself defined by

$H(\zeta_1,\zeta_2,\dots,\zeta_N) = (U_1,U_2,\dots,U_N)$

where the right hand side is the maximum element of the analog of (3.20).

3.3.3 Existence of an admissible sensor schedule

Our objective in this section is to show that the maximum element $U_1, U_2$ of the QVI (3.10) provides the value function for the optimization problem (2.61), (2.66) when the assumption (3.11) holds. Furthermore we want to show how an admissible optimal sensor schedule is determined once the pair $U_1, U_2$ is known.

We shall prove that

$U_i(\pi,0) = \inf_{u(0)=i,\ p(0)=\pi} J(u(\cdot)), \quad i = 1,2$ (3.48)

where $\pi \in H^+$ satisfies $(\pi,1) = 1$. An optimal schedule will be constructed as follows. Suppose, to fix ideas, that $i = 1$. Then define

$\tau_1^* = \inf\{t \le T \mid U_1(p_1(t),t) = K_1(p_1(t)) + U_2(p_1(t),t)\}$ (3.49)

where again $p_1(t)$ is the solution of (3.2). We write

$p^*(t) = p_1(t), \quad t \in [0,\tau_1^*]$. (3.50)

Next define

$\tau_2^* = \inf\{\tau_1^* \le t \le T \mid U_2(p_2(t),t) = K_2(p_2(t)) + U_1(p_2(t),t)\}$. (3.51)

In (3.51), it must be kept in mind that $p_2(t)$ represents the solution of (3.2) with $j = 2$, starting at $\tau_1^*$ with value $p_1(\tau_1^*)$. We then define

$p^*(t) = p_2(t), \quad t \in (\tau_1^*,\tau_2^*]$. (3.52)

Note that, unless $\tau_1^* = T$,

$\tau_2^* > \tau_1^*$, (3.53)

otherwise

$U_1(p_1(\tau_1^*),\tau_1^*) = K_1(p_1(\tau_1^*)) + U_2(p_1(\tau_1^*),\tau_1^*)$

$U_2(p_1(\tau_1^*),\tau_1^*) = K_2(p_1(\tau_1^*)) + U_1(p_1(\tau_1^*),\tau_1^*)$ (3.54)

which is impossible since

$K_1(p_1(\tau_1^*)) > 0, \quad K_2(p_1(\tau_1^*)) > 0$ a.s. (3.55)

Similarly one proceeds to construct a sequence $\tau_1^* < \tau_2^* < \tau_3^* < \dots$ and the process $p^*(\cdot)$. We can then prove the following.

Theorem 3.2. With the same assumptions as in Theorem 3.1 and in addition assuming that (3.11) holds, the sequence of stopping times $\tau_1^* < \tau_2^* < \dots$ defines an optimal admissible sensor schedule.

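Proof: Considering (3.10) as a VI with obstacle $K_1 + U_2$ we can write from the definition of $\tau_1^*$

$U_1(\pi,0) = E\{U_1(p_1(\tau_1^*),\tau_1^*) + \int_0^{\tau_1^*} C_1(p_1(\lambda))\,d\lambda\}$. (3.56)

This can be established by utilizing the penalization (3.23), along similar lines as in [9, pp. 578-587]. Then

$E\{U_1(p_1(\tau_1^*),\tau_1^*)\} = E\{\Psi(p^*(T))\chi_{\tau_1^*=T}\} + E\{U_1(p_1(\tau_1^*),\tau_1^*)\chi_{\tau_1^*<T}\}$.

Substituting back in (3.56) and using the definition of $\tau_1^*$ in (3.49) we obtain

$U_1(\pi,0) = E\{\Psi(p^*(T))\chi_{\tau_1^*=T} + \int_0^{\tau_1^*} C_1(p^*(\lambda))\,d\lambda + K_1(p^*(\tau_1^*))\chi_{\tau_1^*<T} + U_2(p^*(\tau_1^*),\tau_1^*)\chi_{\tau_1^*<T}\}$. (3.57)

Furthermore, again by employing penalization one can show that

$E\{U_2(p^*(\tau_1^*),\tau_1^*)\} = E\{U_2(p_2(\tau_1^*),\tau_1^*)\} = E\{U_2(p_2(\tau_2^*),\tau_2^*) + \int_{\tau_1^*}^{\tau_2^*} C_2(p_2(\lambda))\,d\lambda\}$. (3.58)

This implies

$E\{U_2(p^*(\tau_1^*),\tau_1^*)\chi_{\tau_1^*<T}\} = E\{U_2(p_2(\tau_2^*),\tau_2^*)\chi_{\tau_1^*<T} + \chi_{\tau_1^*<T}\int_{\tau_1^*}^{\tau_2^*} C_2(p_2(\lambda))\,d\lambda\}$. (3.59)

Next

$E\{U_2(p_2(\tau_2^*),\tau_2^*)\chi_{\tau_1^*<T}\} = E\{\Psi(p^*(T))\chi_{\tau_1^*<T,\,\tau_2^*=T}\} + E\{U_2(p^*(\tau_2^*),\tau_2^*)\chi_{\tau_2^*<T}\}$.

Substituting back in (3.57) and using the definition of $\tau_2^*$ in (3.51) we obtain

$U_1(\pi,0) = E\{\Psi(p^*(T))\chi_{\tau_2^*=T} + K_1(p^*(\tau_1^*))\chi_{\tau_1^*<T} + K_2(p^*(\tau_2^*))\chi_{\tau_2^*<T} + \int_0^{\tau_1^*} C_1(p^*(\lambda))\,d\lambda + \int_{\tau_1^*}^{\tau_2^*} C_2(p^*(\lambda))\,d\lambda + U_1(p^*(\tau_2^*),\tau_2^*)\chi_{\tau_2^*<T}\}$. (3.60)

Proceeding in a similar fashion, and collecting results, we can write (with $\tau_0^* = 0$):

$U_1(\pi,0) = E\{\Psi(p^*(T))\chi_{\tau_n^*=T} + \sum_{i=1}^{n} K_i(p^*(\tau_i^*))\chi_{\tau_i^*<T} + \sum_{i=0}^{n-1} \chi_{\tau_i^*<T}\int_{\tau_i^*}^{\tau_{i+1}^*} C_{i+1}(p^*(\lambda))\,d\lambda + U_{n+1}(p^*(\tau_n^*),\tau_n^*)\chi_{\tau_n^*<T}\}$ (3.61)

where we used the notation

$K_i = K_1$ if $i$ is odd, $K_2$ if $i$ is even

$C_i = C_1$ if $i$ is odd, $C_2$ if $i$ is even

$U_i = U_1$ if $i$ is odd, $U_2$ if $i$ is even. (3.62)

However, observe that necessarily $\tau_n^* = T$ for $n$ large enough (random). Otherwise one has $\tau_n^* < T$, $\forall n$, on a set $\Omega_0 \subset \Omega$ of positive probability. But $\tau_n^* \uparrow \tau^* \le T$ and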


where  (since  (x,l)  =  1) 


(see  (2.66))  and 


(p*(0,i)  — (P*(0,D 


(p*(r*),l)  =  1  +  f  p'6Tdy 
Jo 


(p-(O,l)  =  £{f(r*)|^-*J}>0  a.t. 
where $\zeta(\cdot)$ is the process introduced by (2.8). Therefore on $\Omega_0$, as $n \to \infty$

53/f<(p'(r,’))Xf;<r  — *  +oo 


(3.63) 

(3.64) 

(3.65) 

(3.66)

and since $\Omega_0$ has positive probability, as $n \to \infty$,

$E\{\sum_{i=1}^{n} K_i(p^*(\tau_i^*))\chi_{\tau_i^*<T}\} \to \infty$, (3.67)

which contradicts (3.19).

We can thus assert that

$\chi_{\tau_n^*=T} \to 1$ a.s. (3.68)

In particular, it follows that the sequence $\tau_1^* < \tau_2^* < \dots$ defines an admissible schedule denoted by $u^*$. The corresponding state solution of (2.66) coincides with $p^*$ and (3.61) implies

$U_1(\pi,0) \ge J(u^*(\cdot))$. (3.69)

But by standard arguments, one checks that

$U_1(\pi,0) \le J(u(\cdot)), \quad \forall u(\cdot) \in U_{ad}$ (3.70)

and therefore $u^*(\cdot)$ is indeed optimal.
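
The construction of the switching times in (3.49), (3.51) can be mimicked on tabulated value functions. The sketch below assumes a finite set of information states indexed by integers, value tables U[mode][k][x] on a time grid, and a hypothetical callable advance(mode, x, k) that moves the information state forward under the active sensor; none of these objects appear in the report, and equality with the obstacle is tested up to a tolerance.

```python
def switching_schedule(U, K, x0, advance, steps, start=1, tol=1e-12):
    """Return the list of (time step, new sensor) switches along one path.

    U[m][k][x] : tabulated value of starting at step k in state x with sensor m
    K[m][x]    : cost of switching away from sensor m in state x
    advance    : hypothetical one-step evolution of the information state
    """
    mode, x = start, x0
    switches = []
    for k in range(steps):
        other = 2 if mode == 1 else 1
        # Switch at the first step where the value touches K + U_other.
        if abs(U[mode][k][x] - (K[mode][x] + U[other][k][x])) <= tol:
            switches.append((k, other))
            mode = other
        x = advance(mode, x, k)
    return switches

# Hypothetical one-state tables in which sensor 2 is cheaper to run.
U = {1: [[0.3], [0.2], [0.1], [0.0]],
     2: [[0.25], [0.15], [0.05], [0.0]]}
K = {1: [0.05], 2: [0.05]}
stay = lambda mode, x, k: 0
from_1 = switching_schedule(U, K, 0, stay, 3, start=1)
from_2 = switching_schedule(U, K, 0, stay, 3, start=2)
```

Because both switching costs are positive, the two contact conditions cannot hold at the same step, so at most one switch is recorded per step; this is the discrete counterpart of (3.53).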

3.3.4  The  main  result 

We want now to get rid of (3.11) and consider the original functional $\Psi$ in (3.8). Let us consider the approximation (3.12) $\Psi_n$ of $\Psi$. To $\Psi_n$ corresponds a system of QVI

$U_1^n, U_2^n \in C(0,T;C_1)$

$U_1^n, U_2^n \ge 0$

$U_1^n(\pi,T) = U_2^n(\pi,T) = \Psi_n(\pi)$

$U_1^n(\pi,t) \le \Phi_1(s-t)U_1^n(\pi,s) + \int_t^s \Phi_1(\lambda-t)C_1(\pi)\,d\lambda$

$U_2^n(\pi,t) \le \Phi_2(s-t)U_2^n(\pi,s) + \int_t^s \Phi_2(\lambda-t)C_2(\pi)\,d\lambda$

$\forall s \ge t$

$U_1^n(\pi,t) \le K_1(\pi) + U_2^n(\pi,t)$

$U_2^n(\pi,t) \le K_2(\pi) + U_1^n(\pi,t)$. (3.71)

From Theorem 3.2, we can assert that

$U_i^n(\pi,0) = \inf_{u(0)=i,\ p(0)=\pi} J^n(u(\cdot)), \quad i = 1,2$ (3.72)

where

$J^n(u(\cdot)) = E\{\Psi_n(p(u(\cdot),T)) + \int_0^T (p(u(\cdot),t),C(u(t)))\,dt + \sum_{i=1}^{\infty} \chi_{\tau_i<T}(p(u(\cdot),\tau_i),K(u_{i-1},u_i))\}$. (3.73)

Therefore we deduce that

$J^n(u(\cdot)) - J(u(\cdot)) = E\{\Psi_n(p(u(\cdot),T)) - \Psi(p(u(\cdot),T))\}$ (3.74)

and from (3.12) we deduce


+£  I  ^/p(«(  ),r)x  (1  -  - — 

.  (/pwi.twi + (1 + ^)T, -.)■") x  /pr«(.b)^}(3  75) 


But  using  the  equation  (2.50)  yields  (see  (2.1a)) 

r  \  f  P(«(*).0ll*ir  dl }  rl  [‘  f  nfiif  1  rlfrl  /  d<k>  2l!IH,(2n  +  H*H*)*J 

E (/-„+ Fir-''1)  -  (')- )l)ifcr  (.+  w 

,  /r  2l|z]|I(2n-h  jjxjQ  ,  8*.-*j"*  ^ 

^  v ,J  (»  +  Ml*)*  (» +  H’)V 

(n  +  ||x||2)2  j  J  n  +  ||x||J  J 

where we employed the summation convention over repeated indices. Hence after majorizing conveniently

(3.76)

We shall use capital Greek letters, $\Gamma, \Lambda, \dots$, to indicate constants in the following estimates. Finally we deduce


|  s  r.|/Sg..!] 

<  r,  [-  f  tt(x)||x|iVx  +  -  . 

In  J  n 


(3.77) 


Next consider

$\sigma(u(\cdot),t) = \frac{p(u(\cdot),t)}{(p(u(\cdot),t),1)}$

which is the normalized conditional probability measure and satisfies Kushner's equation

$d(\sigma(t)(\varphi)) = \sigma(t)(L\varphi)\,dt + (\sigma(t)(h\varphi) - \sigma(t)(\varphi)\sigma(t)(h))^T(dz - \sigma(t)(h)\,dt)$. (3.78)

If we apply (3.78) with $\varphi = \|x\|^2 = x^2$, we obtain

<  Ao(l  +  E{c(t)(x*)l  (3.70) 

Finally 

£M0(x’}}  <  A tJ  *(*)||*||*rfr  (3.80) 

But  the  2nd  term  in  (3.75)  is 

E  {<’(r)  (*(1  +  (TT^jTTl’)  W)  (xrO  -  jTT^;)) 


s  [£  1IWT)  (*“ + rf  [£  H  l’'11  -  “’)] 

<  A1(i?{^(T)(x,)})1/I  (e  |||p(T)  ^x(1  -  )■'!) 

s  4M-i.--.vHr 


1 1* 


(!  +  ?)■'’ 
!>/* 


<  a*[e{Kn(x*wn(^)}]  • 

One  easily  checks  that 

^(pP’Mx*))*}  <  A4  +  [j  *(x)||x||Jdx)  <  A* 

("T?)  11 }  5  2£H(i^?)P(')(^?)}‘" 


(3.81) 


But 


hence 


y*  A* 

L — - — r  < 


Jn  +  x *  y/n 

"  {w<)(^)l*}  s  [a‘£  {wO(^i)l,J  +  v]  * 


(3.82) 

(3.83)
which implies


*{w,)(=£f)I}  s*  [: +  (/ £  ?(>+/'WWW  <m 

Therefore,  continuing  from  (3.81),  the  2nd  term  in  (3.75)  is  majorized  by  Collecting 
results  (from  (3.75),  (3.77),  (3.81),  (3.84))  we  can  assert  that 


|J"(u(-))  -  J(«(-))|  <  ~ 


(3.85) 

(3.86) 

(3.87) 


provided the initial distribution of $p(0)$, i.e. $\pi$, satisfies

/  *(*)IMI4<fc  <  00 

The  estimate  in  (3.85)  is  uniform  with  respect  to  n.  Therefore 

KCK0)  ~  in/  J(u())l  < 

«(o)=»  nl/n 

p(0)  =  w 

In  fact  we  can  replace  0  by  any  t  €  [0,  T]  and  consider  the  function 

V,{x,t)=  inf  ;,(«(■))  (3.88) 

«(«)=•  '  ' 

P(0  =  « r 

where  Jf(u(  ))  corresponds  to  a  problem  analogous  to  (2.50),  (2.61)  starting  in  t  instead  of  0. 
Therefore  we  have 

|tf"(*>0  -  (3.89) 

We have, however, to be careful about the fact that the constant in (3.89) depends on a bound on $\int \pi(x)\|x\|^4\,dx$. More precisely we have proved that

|UT(».*)  -  |  <  ~i(l  +  /  *(x)||*j|4dx)  (3.90) 

where $\Lambda'$ this time does not depend on $\pi$ (assuming that $\pi$ is a probability). It follows that

$U_i^n(\pi,t) \to U_i(\pi,t)$ in $C(0,T;C_1)$. (3.91)

Taking the limit in (3.71), we obtain that $U_1, U_2$ is a solution of (3.10) and moreover

$U_i(\pi,0) = \inf_{u(0)=i,\ p(0)=\pi} J(u(\cdot))$. (3.92)

However by a probabilistic argument already used in section 3.3, any solution of (3.10) is smaller than the right hand side of (3.92). This completes the proof of Theorem 3.1, and also provides the same statement as in Theorem 3.2, without the assumption (3.11) and for our original $\Psi$ given by (3.8).

4 The Partially Observed Stochastic Minimum Principle

4.1  Introduction 

Various proofs have been given of the minimum principle satisfied by an optimal control in a partially observed stochastic control problem. See, for example, the papers by Bensoussan [1], Elliott [5], Haussmann [7], and the recent paper [9] by Haussmann in which the adjoint process is identified. The simple case of a partially observed Markov chain is discussed in the University of Maryland lecture notes [6] of the second author.

We show in this article how a minimum principle for a partially observed diffusion can be obtained by differentiating the statement that a control $u^*$ is optimal. The results of Bismut [2], [3] and Kunita [10] on stochastic flows enable us to compute in an easy and explicit way the change in the cost due to a 'strong variation' of an optimal control. The only technical difficulty is the justification of the differentiation. As we wished to exhibit the simplification obtained by using the ideas of stochastic flows, the result is not proved under the weakest possible hypotheses. Finally, in Section 6, we show how Bensoussan's minimum principle follows from our result if the drift coefficient is differentiable in the control variable.


4.2  Dynamics 

Suppose  the  state  of  the  system  is  described  by  a  stochastic  differential  equation 
dit  = 

it  G  R\  io=xo,  o  <t<T.  (2.1) 

The  control  parameter  u  will  take  values  in  a  compact  subset  U  of  6ome  Euclidean  space  Rk . 
We  shall  make  the  following  assumptions: 

Ay :  xo  is  given;  if  xo  is  a  random  variable  and  Pq  its  distribution  the  situation  when 
/|x|’P0(dx)  <  oo  for  some  q  >  n  +  )  can  be  treated,  as  in  [9],  by  including  an  extra 
integration  with  respect  to  P0. 

A?:  f  :  (0, iT")  x  Rd  x  U  — *  Rd  is  Bore!  measurable,  continuous  in  u  for  each  (f,x), 
continuously  differentiable  in  x  and  for  some  constant  K 

(1  +  I*!)"'  |/(f,x,u)|  +  \fz(t,x,u)\  <  K, . 

A3:  g  :  (O.T'j  X  Rd  Rd ®/?n  is  a  matrix  valued  function,  Borel  measurable,  continuously 
differentiable  in  x,  and  for  some  constant  Kf 

+  lffi(<.*)l  <  k2. 

The observation process is given by

$$dy_t = h(\xi_t)\,dt + d\nu_t, \qquad y_t \in R^m, \quad y_0 = 0, \quad 0 \le t \le T. \tag{2.2}$$

In the above equations $w = (w^1, \ldots, w^n)$ and $\nu = (\nu^1, \ldots, \nu^m)$ are independent Brownian
motions. We also assume

$A_4$: $h : R^d \to R^m$ is Borel measurable, continuously differentiable in $x$, and for some
constant $K_3$

$$|h(x)| + |h_x(x)| \le K_3.$$
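As a concrete illustration of the model (2.1)-(2.2), the following is a minimal Euler-Maruyama sketch for a scalar state and a scalar observation. The coefficients `f`, `g`, `h`, the control set $U = [-1, 1]$ and the feedback rule are hypothetical choices made only so that the growth and boundedness hypotheses above hold; none of them is taken from the report.

```python
import math
import random

def simulate(x0, control, T=1.0, steps=200, seed=0):
    """Euler-Maruyama sketch of (2.1)-(2.2) for scalar state and observation.
    f, g, h below are illustrative choices, not taken from the report."""
    rng = random.Random(seed)
    dt = T / steps
    f = lambda t, x, u: -x + u          # linear growth in x, continuous in u
    g = lambda t, x: 0.3 * math.cos(x)  # bounded with bounded derivative
    h = lambda x: math.tanh(x)          # bounded with bounded derivative
    xi, y = [x0], [0.0]
    for k in range(steps):
        t = k * dt
        u = control(t, y)               # the control sees only the observed path so far
        dw = rng.gauss(0.0, math.sqrt(dt))
        dv = rng.gauss(0.0, math.sqrt(dt))
        y.append(y[-1] + h(xi[-1]) * dt + dv)
        xi.append(xi[-1] + f(t, xi[-1], u) * dt + g(t, xi[-1]) * dw)
    return xi, y

# Hypothetical bang-bang feedback on the latest observation, with values in U = [-1, 1].
xi, y = simulate(0.5, lambda t, y: -1.0 if y[-1] > 0 else 1.0)
```

The control is passed only the observation path, which mirrors the requirement that admissible controls depend on $y$ and not on the unobserved state.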
REMARKS 2.1. These hypotheses can be weakened. For example, in $A_4$, $h$ can be
allowed linear growth in $x$. Because $g$ is bounded, a delicate argument then implies the
exponential $Z$ of (2.3) is in some $L^p$ space, $1 < p < \infty$. (See, for example, Theorem 2.2 of
[8].) However, when $h$ is bounded $Z$ is in all the $L^p$ spaces (see Lemma 2.3). Also, if we
require $f$ to have linear growth in $u$ then the set of control values $U$ can be unbounded
as in [9]. Our objective, however, is not the greatest generality but to demonstrate the
simplicity of the techniques of stochastic flows.

Let $P$ denote Wiener measure on $C([0,T], R^n)$ and $\mu$ denote Wiener measure
on $C([0,T], R^m)$. Consider the space $\Omega = C([0,T], R^n) \times C([0,T], R^m)$ with coordinate
functions $(x_t, y_t)$ and define Wiener measure $P$ on $\Omega$ by

$$P(dx, dy) = P(dx)\,\mu(dy).$$

Definition 2.2. Write $Y = \{Y_t\}$ for the right continuous complete filtration on
$C([0,T], R^m)$ generated by $Y_t^0 = \sigma\{y_s : s \le t\}$. The set of admissible control functions $\underline{U}$
will be the $Y$-predictable functions on $[0,T] \times C([0,T], R^m)$ with values in $U$.

For $u \in \underline{U}$ and $x \in R^d$ write $\xi^u_{s,t}(x)$ for the strong solution of (2.1) corresponding to
control $u$, and with $\xi^u_{s,s}(x) = x$. Write

$$Z^u_{s,t}(x) = \exp\Bigl(\int_s^t h(\xi^u_{s,r}(x))'\,dy_r - \frac{1}{2}\int_s^t |h(\xi^u_{s,r}(x))|^2\,dr\Bigr) \tag{2.3}$$

and define a new probability measure $P^u$ on $\Omega$ by $dP^u/dP = Z^u_{0,T}(x_0)$. Then under $P^u$,
$(\xi^u_{0,t}(x_0), y_t)$ is a solution of (2.1) and (2.2), that is $\xi^u_{0,t}(x_0)$ remains a strong solution of
(2.1) and there is an independent Brownian motion $\nu$ such that $y_t$ satisfies (2.2). A version
of $Z$ defined for every trajectory $y$ of the observation process is obtained by integrating by
parts the stochastic integral in (2.3).

Lemma 2.3. Under hypothesis $A_4$, for $t \le T$,

$$E[(Z^u_{0,t}(x_0))^p] < \infty \quad \text{for all } u \in \underline{U} \text{ and all } p,\ 1 \le p < \infty.$$

PROOF.

$$Z^u_{0,t}(x_0) = 1 + \int_0^t Z^u_{0,r}(x_0)\,h(\xi^u_{0,r}(x_0))'\,dy_r.$$

Therefore, for any $p$ there is a constant $C_p$ such that

$$E[(Z^u_{0,t}(x_0))^p] \le C_p\Bigl[1 + E\Bigl(\int_0^t (Z^u_{0,r}(x_0))^2\,|h(\xi^u_{0,r}(x_0))|^2\,dr\Bigr)^{p/2}\Bigr].$$

The result follows by Gronwall's inequality.
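For bounded $h$ the moment bound of Lemma 2.3 can also be seen directly. The following is a sketch under the reference measure $P$ (so that $y$ is a Brownian motion), assuming $|h| \le K_3$ as in $A_4$; it is an alternative route offered for illustration, not the argument used in the text. Here $M_t$ is a name introduced only for this sketch.

```latex
\begin{aligned}
\bigl(Z^u_{0,t}(x_0)\bigr)^p
 &= \underbrace{\exp\Bigl(p\int_0^t h(\xi^u_{0,r}(x_0))'\,dy_r
      - \tfrac{p^2}{2}\int_0^t |h(\xi^u_{0,r}(x_0))|^2\,dr\Bigr)}_{M_t}
    \,\exp\Bigl(\tfrac{p(p-1)}{2}\int_0^t |h(\xi^u_{0,r}(x_0))|^2\,dr\Bigr) \\
 &\le M_t\,\exp\bigl(\tfrac{1}{2}\,p(p-1)\,K_3^2\,t\bigr),
 \qquad\text{so}\qquad
 E\bigl[(Z^u_{0,t}(x_0))^p\bigr] \le \exp\bigl(\tfrac{1}{2}\,p(p-1)\,K_3^2\,t\bigr).
\end{aligned}
```

The last step uses $E[M_t] = 1$, which holds because $M_t$ is an exponential martingale with bounded integrand $p\,h$ (Novikov's condition is satisfied).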

COST 2.4. We shall suppose the cost is purely terminal and given by some bounded,
differentiable function

$$c(\xi^u_{0,T}(x_0))$$

which has bounded derivatives. Then the expected cost if control $u \in \underline{U}$ is used is

$$J(u) = E_u[c(\xi^u_{0,T}(x_0))].$$

In terms of $P$, under which $y_t$ is always a Brownian motion, this is

$$J(u) = E[Z^u_{0,T}(x_0)\,c(\xi^u_{0,T}(x_0))]. \tag{2.4}$$
4.3 Stochastic Flows

For $u \in \underline{U}$ write

$$\xi^u_{s,t}(x) = x + \int_s^t f(r, \xi^u_{s,r}(x), u_r)\,dr + \int_s^t g(r, \xi^u_{s,r}(x))\,dw_r \tag{3.1}$$

for the solution of (2.1) over the time interval $[s,t]$ with initial condition $\xi^u_{s,s}(x) = x$. In
the sequel we wish to discuss the behaviour of (3.1) for each trajectory $y$ of the observation
process. We have already noted there is a version of $Z$ defined for every $y$. The results of
Bismut [2] and Kunita [10] extend easily and show the map

$$\xi^u_{s,t} : R^d \to R^d$$

is, almost surely, for each $y \in C([0,T], R^m)$ a diffeomorphism. Bismut [2] initially gives
proofs when the coefficients $f$ and $g$ are bounded, but points out that a stopping time
argument extends the results to when, for example, the coefficients have linear growth.

Write $\|\xi^u(x_0)\|_t = \sup_{0 \le s \le t} |\xi^u_{0,s}(x_0)|$. Then, as in Lemma 2.1 of [8], for any $p$,
$1 \le p < \infty$, using Gronwall's and Jensen's inequalities

$$\|\xi^u(x_0)\|_T^p \le C\Bigl(1 + |x_0|^p + \Bigl|\int_0^T g(r, \xi^u_{0,r}(x_0))\,dw_r\Bigr|^p\Bigr)$$

almost surely, for some constant $C$.

Therefore, using Burkholder's inequality and hypothesis $A_3$, $\|\xi^u(x_0)\|_T$ is in $L^p$ for
all $p$, $1 \le p < \infty$.

Suppose $u^* \in \underline{U}$ is an optimal control so $J(u^*) \le J(v)$ for any other $v \in \underline{U}$. Write
$\xi^*_{s,t}(\cdot)$ for $\xi^{u^*}_{s,t}(\cdot)$. The Jacobian $\frac{\partial \xi^*_{s,t}}{\partial x}(x)$ is the matrix solution $C_t$ of the equation, for $s \le t$,

$$dC_t = f_x(t, \xi^*_{s,t}(x), u^*_t)\,C_t\,dt + \sum_{i=1}^n g^{(i)}_x(t, \xi^*_{s,t}(x))\,C_t\,dw^i_t \tag{3.2}$$

with $C_s = I$.

Here $I$ is the $d \times d$ identity matrix and $g^{(i)}$ is the $i$th column of $g$. From hypotheses $A_2$
and $A_3$, $f_x$ and $g_x$ are bounded. Writing $\|C\|_T = \sup_{0 \le t \le T} |C_t|$, an application of Gronwall's,
Jensen's and Burkholder's inequalities again implies $\|C\|_T$ is in $L^p$ for all $p$, $1 \le p < \infty$.

Consider the related matrix valued stochastic differential equation

$$D_t = I - \int_s^t D_r\,f_x(r, \xi^*_{s,r}(x), u^*_r)\,dr - \sum_{i=1}^n \int_s^t D_r\,g^{(i)}_x(r, \xi^*_{s,r}(x))\,dw^i_r + \sum_{i=1}^n \int_s^t D_r\,\bigl(g^{(i)}_x(r, \xi^*_{s,r}(x))\bigr)^2\,dr. \tag{3.3}$$

Then it can be checked that $D_t C_t = I$ for $t \ge s$, so that $D_t$ is the inverse of the Jacobian,
that is $D_t = \bigl(\frac{\partial \xi^*_{s,t}}{\partial x}(x)\bigr)^{-1}$. Again, because $f_x$ and $g_x$ are bounded we have that $\|D\|_T$ is in
every $L^p$, $1 \le p < \infty$.
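The relation between the flow, its Jacobian (3.2) and the inverse (3.3) can be checked numerically in a scalar case. The sketch below is only an illustration under assumed coefficients ($f = -x + u^*$ with $u^* = 0$ and $g = 0.3\cos x$, both hypothetical); it compares $C_t$ with a finite-difference derivative of the Euler flow along the same Brownian path and checks that $D_t C_t$ stays close to 1.

```python
import math
import random

def flow_with_jacobian(x, T=1.0, steps=10000, seed=1):
    """Scalar Euler sketch of the flow (3.1), its Jacobian (3.2) and inverse (3.3).
    Coefficients are illustrative: f = -x + u* with u* = 0, g = 0.3 cos(x)."""
    rng = random.Random(seed)
    dt = T / steps
    xi, C, D = x, 1.0, 1.0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        fx = -1.0                       # f_x
        gx = -0.3 * math.sin(xi)        # g_x evaluated along the flow
        # All three updates use the values at the start of the step.
        xi_new = xi + (-xi) * dt + 0.3 * math.cos(xi) * dw
        C_new = C + fx * C * dt + gx * C * dw                      # (3.2)
        D_new = D - D * fx * dt - D * gx * dw + D * gx * gx * dt   # (3.3)
        xi, C, D = xi_new, C_new, D_new
    return xi, C, D

xi, C, D = flow_with_jacobian(0.5)
eps = 1e-6
xi_eps, _, _ = flow_with_jacobian(0.5 + eps)   # same seed, hence the same Brownian path
finite_difference = (xi_eps - xi) / eps
print(C, finite_difference, D * C)
```

The Euler recursion for `C` is exactly the derivative of the Euler recursion for `xi` with respect to the initial point, so the first comparison is tight; the product `D * C` equals 1 only up to discretization error.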

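For a $d$-dimensional semimartingale $z_t$ Bismut [2] shows one can consider the flow $\xi^*_{s,t}(z_t)$
and gives the semimartingale representation of this process. In fact if
$z_t = z_s + A_t + \sum_{i=1}^n \int_s^t H_i\,dw^i_r$ is the $d$-dimensional semimartingale, Bismut's formula states that

$$\begin{aligned}
\xi^*_{s,t}(z_t) = z_s &+ \int_s^t \Bigl( f(r, \xi^*_{s,r}(z_r), u^*_r) + \sum_{i=1}^n g^{(i)}_x(r, \xi^*_{s,r}(z_r))\,\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\,H_i + \frac{1}{2}\sum_{i=1}^n \frac{\partial^2 \xi^*_{s,r}}{\partial x^2}(z_r)(H_i, H_i) \Bigr)\,dr \\
&+ \int_s^t \frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\,dA_r + \sum_{i=1}^n \int_s^t \Bigl( g^{(i)}(r, \xi^*_{s,r}(z_r)) + \frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\,H_i \Bigr)\,dw^i_r.
\end{aligned} \tag{3.4}$$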

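DEFINITION 3.1. We shall consider perturbations of the optimal control $u^*$ of the following
kind: For $s \in [0,T)$, $h > 0$ such that $0 \le s < s + h \le T$, for any other admissible
control $\tilde u \in \underline{U}$ and $A \in Y_s$, define a strong variation of $u^*$ by

$$u(t,\omega) = \begin{cases} u^*(t,\omega) & \text{if } (t,\omega) \notin [s, s+h] \times A, \\ \tilde u(t,\omega) & \text{if } (t,\omega) \in [s, s+h] \times A. \end{cases}$$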
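The construction in Definition 3.1 can be sketched in code as a wrapper around two control functions. This is a minimal sketch, assuming controls are represented as functions of time and of the observed path so far; the names `strong_variation`, `u_star`, `u_tilde` and `in_A` are hypothetical and introduced only for illustration.

```python
def strong_variation(u_star, u_tilde, s, h, in_A):
    """Return the control that equals u_tilde on [s, s+h] x A and u_star elsewhere.
    Controls are functions (t, y_path) -> value in U; in_A(y_path) tests membership
    in the event A and should depend only on the observations up to time s."""
    def u(t, y_path):
        if s <= t <= s + h and in_A(y_path):
            return u_tilde(t, y_path)
        return u_star(t, y_path)
    return u

# Hypothetical example: perturb the zero control on [0.4, 0.5] on the event
# that the first recorded observation after time 0 is positive.
u = strong_variation(lambda t, y: 0.0, lambda t, y: 1.0, s=0.4, h=0.1,
                     in_A=lambda y: y[1] > 0.0)
```

For instance, `u(0.45, [0.0, 0.2])` returns the perturbed value `1.0`, while `u(0.6, [0.0, 0.2])` falls outside the time window and returns `0.0`.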


Applying (3.4) as in Theorem 5.1 of [4] we have the following result.


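THEOREM 3.2. For the perturbation $u$ of the optimal control $u^*$ consider the process

$$z_t = x + \int_s^t \Bigl(\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\Bigr)^{-1} \bigl( f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r) \bigr)\,dr. \tag{3.5}$$

Then the process $\xi^*_{s,t}(z_t)$ is indistinguishable from $\xi^u_{s,t}(x)$.

PROOF. Note the equation defining $z_t$ involves only an integral in time; there is no
martingale term, so to apply (3.4) we have $H_i = 0$ for all $i$. Therefore, from (3.4)

$$\begin{aligned}
\xi^*_{s,t}(z_t) = x &+ \int_s^t f(r, \xi^*_{s,r}(z_r), u^*_r)\,dr \\
&+ \int_s^t \frac{\partial \xi^*_{s,r}}{\partial x}(z_r) \Bigl(\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\Bigr)^{-1} \bigl( f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r) \bigr)\,dr \\
&+ \int_s^t g(r, \xi^*_{s,r}(z_r))\,dw_r.
\end{aligned}$$

The two drift integrals combine to $\int_s^t f(r, \xi^*_{s,r}(z_r), u_r)\,dr$, so $\xi^*_{s,t}(z_t)$ satisfies (3.1) with
control $u$. However, the solution of (3.1) is unique so

$$\xi^*_{s,t}(z_t) = \xi^u_{s,t}(x).$$

REMARKS 3.3. Note that the perturbation $u(t)$ equals $u^*(t)$ if $t > s + h$ so $z_t = z_{s+h}$
if $t > s + h$ and

$$\xi^u_{s,t}(x) = \xi^*_{s,t}(z_{s+h}) = \xi^*_{s+h,t}(\xi^u_{s,s+h}(x)).$$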
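Theorem 3.2 can be verified by hand in a simple case. The following worked example assumes, purely for illustration, a scalar linear drift $f(t,x,u) = a x + u$ and a constant diffusion coefficient $g = \sigma$; the constants $a$ and $\sigma$ are hypothetical. In this case the flow is affine in $x$ and the theorem reduces to the variation of constants formula.

```latex
\begin{aligned}
\xi^u_{s,t}(x) &= e^{a(t-s)}x + \int_s^t e^{a(t-r)}u_r\,dr + \sigma\int_s^t e^{a(t-r)}\,dw_r,
 \qquad \frac{\partial \xi^*_{s,t}}{\partial x} = e^{a(t-s)}, \\
z_t &= x + \int_s^t e^{-a(r-s)}\,(u_r - u^*_r)\,dr, \\
\xi^*_{s,t}(z_t) &= e^{a(t-s)}z_t + \int_s^t e^{a(t-r)}u^*_r\,dr + \sigma\int_s^t e^{a(t-r)}\,dw_r \\
 &= e^{a(t-s)}x + \int_s^t e^{a(t-r)}u_r\,dr + \sigma\int_s^t e^{a(t-r)}\,dw_r
  = \xi^u_{s,t}(x).
\end{aligned}
```

The second line is (3.5) with the inverse Jacobian $e^{-a(r-s)}$ and $f(r,\cdot,u_r) - f(r,\cdot,u^*_r) = u_r - u^*_r$; the effect of the control perturbation is thus carried entirely by the shifted initial point $z_t$.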




4.4  Augmented  Flows 

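Consider the augmented flow which includes as an extra coordinate the stochastic
exponential $Z^*_{s,t}$ with a 'variable' initial condition $z \in R$ for $Z^*_{s,t}(\cdot)$. That is, consider the
$(d+1)$ dimensional system given by:

$$\begin{aligned}
\xi^*_{s,t}(x) &= x + \int_s^t f(r, \xi^*_{s,r}(x), u^*_r)\,dr + \int_s^t g(r, \xi^*_{s,r}(x))\,dw_r, \\
Z^*_{s,t}(x,z) &= z + \int_s^t Z^*_{s,r}(x,z)\,h(\xi^*_{s,r}(x))'\,dy_r.
\end{aligned}$$

Therefore,

$$Z^*_{s,t}(x,z) = z\,Z^*_{s,t}(x) = z \exp\Bigl(\int_s^t h(\xi^*_{s,r}(x))'\,dy_r - \frac{1}{2}\int_s^t |h(\xi^*_{s,r}(x))|^2\,dr\Bigr)$$

and we see there is a version of the enlarged system defined for each trajectory $y$ by
integrating by parts the stochastic integral. The augmented map $(x,z) \to (\xi^*_{s,t}(x), Z^*_{s,t}(x,z))$
is then almost surely a diffeomorphism of $R^{d+1}$. Note that $\partial \xi^*_{s,t}/\partial z = 0$, $\partial f/\partial z = 0$ and
$\partial g/\partial z = 0$. The Jacobian of this augmented map is, therefore, represented by the matrix

$$\begin{pmatrix} \dfrac{\partial \xi^*_{s,t}}{\partial x} & 0 \\[2mm] \dfrac{\partial Z^*_{s,t}}{\partial x} & \dfrac{\partial Z^*_{s,t}}{\partial z} \end{pmatrix}$$

and for $1 \le i \le d$, from the equation for $Z^*_{s,t}(x,z)$ above,

$$\frac{\partial Z^*_{s,t}(x,z)}{\partial x_i} = \sum_{j=1}^m \int_s^t \Bigl( Z^*_{s,r}(x,z)\,\frac{\partial h^j(\xi^*_{s,r}(x))}{\partial \xi_k}\,\frac{\partial \xi^*_{k,s,r}(x)}{\partial x_i} + h^j(\xi^*_{s,r}(x))\,\frac{\partial Z^*_{s,r}(x,z)}{\partial x_i} \Bigr)\,dy^j_r. \tag{4.1}$$

(Here the double index $k$ is summed from 1 to $d$.)

We shall be interested in the solution of this differential system (4.1) only in the
situation when $z = 1$ so we shall write $Z^*_{s,t}(x)$ for $Z^*_{s,t}(x,1)$. The following result is
motivated by formally differentiating the exponential formula for $Z^*_{s,t}(x)$.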
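The formal differentiation referred to here can be sketched as follows, written for a scalar state and a scalar observation ($d = m = 1$) to keep the notation light. It is a heuristic calculation only: it differentiates the exponential formula (2.3) under the integral signs without justification, which is why a separate proof is needed.

```latex
\begin{aligned}
\frac{\partial Z^*_{s,t}(x)}{\partial x}
 &= Z^*_{s,t}(x)\Bigl(\int_s^t h_x(\xi^*_{s,r}(x))\,\frac{\partial \xi^*_{s,r}(x)}{\partial x}\,dy_r
   - \int_s^t h(\xi^*_{s,r}(x))\,h_x(\xi^*_{s,r}(x))\,\frac{\partial \xi^*_{s,r}(x)}{\partial x}\,dr\Bigr) \\
 &= Z^*_{s,t}(x)\int_s^t h_x(\xi^*_{s,r}(x))\,\frac{\partial \xi^*_{s,r}(x)}{\partial x}\,
    \bigl(dy_r - h(\xi^*_{s,r}(x))\,dr\bigr).
\end{aligned}
```

By (2.2) the increment $dy_r - h(\xi^*_{s,r}(x))\,dr$ is the increment $d\nu_r$ of the observation noise, which is how the Brownian motion $\nu$ enters the derivative formula.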
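Lemma 4.1.

$$\frac{\partial Z^*_{s,t}(x)}{\partial x} = Z^*_{s,t}(x) \Bigl( \int_s^t h_x(\xi^*_{s,r}(x)) \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,d\nu_r \Bigr)$$

where $\nu = (\nu^1, \ldots, \nu^m)$ is the Brownian motion in the observation process.

PROOF. From (4.1) we see $\partial Z^*_{s,t}(x)/\partial x$ is the solution of the stochastic differential equation

$$\frac{\partial Z^*_{s,t}(x)}{\partial x} = \int_s^t \Bigl( \frac{\partial Z^*_{s,r}(x)}{\partial x}\,h'(\xi^*_{s,r}(x)) + Z^*_{s,r}(x)\,h_x(\xi^*_{s,r}(x)) \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x} \Bigr)\,dy_r. \tag{4.2}$$

Write

$$L_{s,t}(x) = Z^*_{s,t}(x) \Bigl( \int_s^t h_x(\xi^*_{s,r}(x)) \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,d\nu_r \Bigr)$$

where

$$dy_r = h(\xi^*_{s,r}(x))\,dr + d\nu_r.$$

Because

$$Z^*_{s,t}(x) = 1 + \int_s^t Z^*_{s,r}(x)\,h'(\xi^*_{s,r}(x))\,dy_r$$

the product rule gives

$$\begin{aligned}
L_{s,t}(x) &= \int_s^t Z^*_{s,r}(x)\,h_x \cdot \frac{\partial \xi^*_{s,r}}{\partial x}\,d\nu_r + \int_s^t \Bigl( \int_s^r h_x \cdot \frac{\partial \xi^*_{s,\sigma}}{\partial x}\,d\nu_\sigma \Bigr) Z^*_{s,r}(x)\,h'(\xi^*_{s,r}(x))\,dy_r \\
&\quad + \int_s^t Z^*_{s,r}(x)\,h'(\xi^*_{s,r}(x))\,h_x \cdot \frac{\partial \xi^*_{s,r}}{\partial x}\,dr \\
&= \int_s^t L_{s,r}(x)\,h'(\xi^*_{s,r}(x))\,dy_r + \int_s^t Z^*_{s,r}(x)\,h_x \cdot \frac{\partial \xi^*_{s,r}}{\partial x}\,dy_r.
\end{aligned}$$

Therefore, $L_{s,t}(x)$ is also a solution of (4.2) so by uniqueness

$$L_{s,t}(x) = \frac{\partial Z^*_{s,t}(x)}{\partial x}.$$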

REMARKS 4.2. As noted at the beginning of this section we can consider the
augmented flow

$$(x,z) \to (\xi^*_{s,t}(x), Z^*_{s,t}(x,z)) \quad \text{for } x \in R^d,\ z \in R,$$

and we are only interested in the situation when $z = 1$, so we write $Z^*_{s,t}(x)$.
Lemma 4.3. $Z^*_{s,t}(z_t) = Z^u_{s,t}(x)$ where $z_t$ is the semimartingale defined in (3.5).

PROOF. $Z^u_{s,t}(x)$ is the process uniquely defined by

$$Z^u_{s,t}(x) = 1 + \int_s^t Z^u_{s,r}(x)\,h'(\xi^u_{s,r}(x))\,dy_r. \tag{4.3}$$

Consider an augmented $(d+1)$ dimensional version of (3.5) defining a semimartingale
$\bar z_t = (z_t, 1)$, so the additional component is always identically 1. Then applying (3.4) to
the new component of the augmented process we have

$$\begin{aligned}
Z^*_{s,t}(z_t) &= 1 + \int_s^t Z^*_{s,r}(z_r)\,h'(\xi^*_{s,r}(z_r))\,dy_r \\
&= 1 + \int_s^t Z^*_{s,r}(z_r)\,h'(\xi^u_{s,r}(x))\,dy_r
\end{aligned}$$

by Theorem 3.2. However, (4.3) has a unique solution so $Z^*_{s,t}(z_t) = Z^u_{s,t}(x)$.

REMARKS 4.4. Note that for $t > s + h$

$$Z^*_{s,t}(z_t) = Z^*_{s,t}(z_{s+h}).$$

4.5  The  Minimum  Principle 

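Control $u$ will be the perturbation of the optimal control $u^*$ as in Definition 3.1. We
shall write $x = \xi^*_{0,s}(x_0)$. Then the minimum cost is

$$\begin{aligned} J(u^*) &= E[Z^*_{0,T}(x_0)\,c(\xi^*_{0,T}(x_0))] \\ &= E[Z^*_{0,s}(x_0)\,Z^*_{s,T}(x)\,c(\xi^*_{s,T}(x))]. \end{aligned}$$

The cost corresponding to the perturbed control $u$ is

$$\begin{aligned} J(u) &= E[Z^*_{0,s}(x_0)\,Z^u_{s,T}(x)\,c(\xi^u_{s,T}(x))] \\ &= E[Z^*_{0,s}(x_0)\,Z^*_{s,T}(z_{s+h})\,c(\xi^*_{s,T}(z_{s+h}))] \end{aligned}$$

by Theorem 3.2 and Lemma 4.3. Now $Z^*_{s,T}(\cdot)$ and $c(\xi^*_{s,T}(\cdot))$ are differentiable with
continuous and uniformly integrable derivatives. Therefore

$$\begin{aligned} J(u) - J(u^*) &= E[Z^*_{0,s}(x_0)(Z^*_{s,T}(z_{s+h})\,c(\xi^*_{s,T}(z_{s+h})) - Z^*_{s,T}(x)\,c(\xi^*_{s,T}(x)))] \\ &= E\Bigl[\int_s^{s+h} \Gamma(s, z_r)\,(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r))\,dr\Bigr] \end{aligned}$$

where

$$\Gamma(s, z_r) = Z^*_{0,s}(x_0)\,Z^*_{s,T}(z_r) \Bigl\{ c_\xi(\xi^*_{s,T}(z_r))\,\frac{\partial \xi^*_{s,T}}{\partial x}(z_r) + c(\xi^*_{s,T}(z_r)) \Bigl( \int_s^T h_\xi(\xi^*_{s,\sigma}(z_r))\,\frac{\partial \xi^*_{s,\sigma}}{\partial x}(z_r)\,d\nu_\sigma \Bigr) \Bigr\} \Bigl(\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\Bigr)^{-1}.$$

Note that this expression gives an explicit formula for the change in the cost resulting from
a variation in the optimal control. The only remaining problem is to justify differentiating
the right hand side.

From Lemma 2.3, $Z$ is in every $L^p$ space, $1 \le p < \infty$, and from the remarks at the
beginning of Section 3, $C_T = \frac{\partial \xi^*_{s,T}}{\partial x}$ and $D_r = \bigl(\frac{\partial \xi^*_{s,r}}{\partial x}\bigr)^{-1}$ are in every $L^p$ space, $1 \le p < \infty$.

Consequently, $\Gamma$ is in every $L^p$ space, $1 \le p < \infty$.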
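Therefore

$$\begin{aligned}
J(u) - J(u^*) &= \int_s^{s+h} E[(\Gamma(s,z_r) - \Gamma(s,x))(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r))]\,dr \\
&\quad + \int_s^{s+h} E[(\Gamma(s,x) - \Gamma(r,x))(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r))]\,dr \\
&\quad + \int_s^{s+h} E[\Gamma(r,x)(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r) - f(r, \xi^*_{s,r}(x), u_r) + f(r, \xi^*_{s,r}(x), u^*_r))]\,dr \\
&\quad + \int_s^{s+h} E[\Gamma(r,x)(f(r, \xi^*_{0,r}(x_0), u_r) - f(r, \xi^*_{0,r}(x_0), u^*_r))]\,dr \\
&= I_1(h) + I_2(h) + I_3(h) + I_4(h), \quad \text{say}.
\end{aligned}$$

$$\begin{aligned}
|I_1(h)| &\le K_1 \int_s^{s+h} E[|\Gamma(s,z_r) - \Gamma(s,x)|(1 + \|\xi^u(x_0)\|_{s+h})]\,dr \\
&\le K_1 h \sup_{s \le r \le s+h} E[|\Gamma(s,z_r) - \Gamma(s,x)|(1 + \|\xi^u(x_0)\|_{s+h})], \\
|I_2(h)| &\le K_2 \int_s^{s+h} E[|\Gamma(s,x) - \Gamma(r,x)|(1 + \|\xi^u(x_0)\|_{s+h})]\,dr \\
&\le K_2 h \sup_{s \le r \le s+h} E[|\Gamma(s,x) - \Gamma(r,x)|(1 + \|\xi^u(x_0)\|_{s+h})], \\
|I_3(h)| &\le K_3 \int_s^{s+h} E[|\Gamma(r,x)|\,\|x - z\|_{s+h}]\,dr \\
&\le K_3 h \sup_{s \le r \le s+h} E[|\Gamma(r,x)|\,\|x - z\|_{s+h}].
\end{aligned}$$

The differences $|\Gamma(s,z_r) - \Gamma(s,x)|$, $|\Gamma(s,x) - \Gamma(r,x)|$ and $\|x - z\|_{s+h}$ are all uniformly
bounded in some $L^p$, $p \ge 1$, and

$$\lim_{r \to s} |\Gamma(s,z_r) - \Gamma(s,x)| = 0 \ \text{a.s.}, \qquad \lim_{r \to s} |\Gamma(s,x) - \Gamma(r,x)| = 0 \ \text{a.s.}, \qquad \lim_{h \to 0} \|x - z\|_{s+h} = 0.$$

Therefore,

$$\lim_{r \to s} \|\Gamma(s,z_r) - \Gamma(s,x)\|_p = 0, \qquad \lim_{r \to s} \|\Gamma(s,x) - \Gamma(r,x)\|_p = 0, \qquad \text{and} \quad \lim_{h \to 0} \|(\|x - z\|_{s+h})\|_p = 0 \quad \text{for some } p.$$

Consequently, $\lim_{h \to 0} h^{-1} I_k(h) = 0$, for $k = 1, 2, 3$.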
The only remaining problem concerns the differentiability of

$$I_4(h) = \int_s^{s+h} E[\Gamma(r,x)(f(r, \xi^*_{0,r}(x_0), u_r) - f(r, \xi^*_{0,r}(x_0), u^*_r))]\,dr.$$

The integrand is almost surely in $L^1([0,T])$ so $\lim_{h \to 0} h^{-1} I_4(h)$ exists for almost every
$s \in [0,T]$. However, the set of times $\{s\}$ where the limit may not exist might depend on the
control $u$. Consequently we must restrict the perturbations $u$ of the optimal control $u^*$ to
perturbations from a countable dense set of controls. In fact:

1) Because the trajectories are, almost surely, continuous, $Y_\rho$ is countably generated
by sets $\{A_{i\rho}\}$, $i = 1, 2, \ldots$ for any rational number $\rho \in [0,T]$. Consequently $Y_t$ is
countably generated by the sets $\{A_{i\rho}\}$, $\rho \le t$.

2) Let $G_t$ denote the set of measurable functions from $(\Omega, Y_t)$ to $U \subset R^k$. (If $u \in \underline{U}$
then $u(t,\omega) \in G_t$.) Using the $L^1$-norm, as in [5], there is a countable dense subset
$H_\rho = \{u_{j\rho}\}$ of $G_\rho$, for rational $\rho \in [0,T]$. If $H_t = \bigcup_{\rho \le t} H_\rho$ then $H_t$ is a countable
dense subset of $G_t$. If $u_{j\rho} \in H_\rho$ then, as a function constant in time, $u_{j\rho}$ can be
considered as an admissible control over the time interval $(t,T]$ for $t \ge \rho$.

3) The countable family of perturbations is obtained by considering sets $A_{i\rho} \in Y_t$,
functions $u_{j\rho} \in H_t$, where $\rho \le t$, and defining as in 3.1

$$u^*_{j\rho}(s,\omega) = \begin{cases} u^*(s,\omega) & \text{if } (s,\omega) \notin [t,T] \times A_{i\rho}, \\ u_{j\rho}(s,\omega) & \text{if } (s,\omega) \in [t,T] \times A_{i\rho}. \end{cases}$$

Then for each $i$, $j$, $\rho$

$$\lim_{h \to 0} h^{-1} \int_s^{s+h} E[\Gamma(r,x)(f(r, \xi^*_{0,r}(x_0), u_{j\rho}) - f(r, \xi^*_{0,r}(x_0), u^*_r))\,I_{A_{i\rho}}]\,dr \tag{5.1}$$

exists and equals

$$E[\Gamma(s,x)(f(s, \xi^*_{0,s}(x_0), u_{j\rho}) - f(s, \xi^*_{0,s}(x_0), u^*))\,I_{A_{i\rho}}]$$

for almost all $s \in [0,T]$.

Therefore, considering this perturbation we have

$$\lim_{h \to 0} h^{-1}(J(u^*_{j\rho}) - J(u^*)) = E[\Gamma(s,x)(f(s, \xi^*_{0,s}(x_0), u_{j\rho}) - f(s, \xi^*_{0,s}(x_0), u^*))\,I_{A_{i\rho}}] \ge 0 \quad \text{for almost all } s \in [0,T].$$

Consequently there is a set $S \subset [0,T]$ of zero Lebesgue measure such that, if $s \notin S$, the
limit in (5.1) exists for all $i$, $j$, $\rho$, and gives

$$E[\Gamma(s,x)(f(s, \xi^*_{0,s}(x_0), u_{j\rho}) - f(s, \xi^*_{0,s}(x_0), u^*))\,I_{A_{i\rho}}] \ge 0.$$

Using the monotone class theorem, and approximating an arbitrary admissible control
$v \in \underline{U}$, we can deduce that if $s \notin S$

$$E[\Gamma(s,x)(f(s, \xi^*_{0,s}(x_0), v) - f(s, \xi^*_{0,s}(x_0), u^*))\,I_A] \ge 0 \quad \text{for any } A \in Y_s. \tag{5.2}$$

Write

$$p_s(x) = E^*\Bigl[ c_\xi(\xi^*_{0,T}(x_0))\,\frac{\partial \xi^*_{s,T}(x)}{\partial x} + c(\xi^*_{0,T}(x_0)) \Bigl( \int_s^T h_\xi(\xi^*_{0,\sigma}(x_0))\,\frac{\partial \xi^*_{s,\sigma}(x)}{\partial x}\,d\nu_\sigma \Bigr) \Bigm| Y_s \vee \{x\} \Bigr]$$

where, as before, $x = \xi^*_{0,s}(x_0)$ and $E^*$ denotes expectation under $P^* = P^{u^*}$. Then $p_s(x)$
is the co-state variable and we have in (5.2) proved the following 'conditional' minimum
principle:

THEOREM 5.1. If $u^* \in \underline{U}$ is an optimal control there is a set $S \subset [0,T]$ of zero Lebesgue
measure such that if $s \notin S$

$$E^*[p_s(x)\,f(s,x,u^*) \mid Y_s] \le E^*[p_s(x)\,f(s,x,u) \mid Y_s] \quad \text{a.s.}$$

That is, the optimal control $u^*$ almost surely minimizes the conditional Hamiltonian and
the adjoint variable is $p_s(x)$.
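To illustrate what minimizing the conditional Hamiltonian means operationally, the sketch below replaces the conditional expectation by a plain sample average over hypothetical draws of $(p_s(x), x)$ and searches a finite grid of control values. The function name, the sample data and the grid are all assumptions made for illustration; how to generate samples from the conditional law given $Y_s$ is a filtering problem that is not addressed here.

```python
def minimize_conditional_hamiltonian(samples, f, candidates, s):
    """Pick the control value minimizing a sample average of p * f(s, x, u).
    `samples` is a list of (p, x) pairs standing in for draws of (p_s(x), x)
    from the conditional law given Y_s; obtaining them is not addressed here."""
    def hamiltonian(u):
        return sum(p * f(s, x, u) for p, x in samples) / len(samples)
    return min(candidates, key=hamiltonian)

# Illustrative scalar data with f = -x + u, so the average is linear in u.
samples = [(0.8, 0.1), (1.2, -0.3), (0.5, 0.4)]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
best = minimize_conditional_hamiltonian(samples, lambda s, x, u: -x + u, grid, s=0.3)
```

With these numbers the averaged Hamiltonian is increasing in `u` because the sample mean of `p` is positive, so the minimizer is the left end point of the grid.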
4.6 Conclusions

Using the theory of stochastic flows the effect of a perturbation of an optimal control
is explicitly calculated. The only difficulty was to justify its differentiation. The adjoint
process is explicitly identified as $p_s(x)$.

THEOREM 6.1. If $f$ is differentiable in the control variable $u$, and if the random variable
$x = \xi^*_{0,s}(x_0)$ has a conditional density $q_s(x)$ under the measure $P^*$, then the inequality of
Theorem 5.1 implies

$$\sum_{j=1}^k (u^*_j(s) - u_j(s)) \int_{R^d} \Gamma(s,x)\,\frac{\partial f}{\partial u_j}(s, x, u^*)\,q_s(x)\,dx \le 0.$$

This is the result of Bensoussan's paper [1].

The method of this paper can be applied to completely observable systems by initially
considering 'stochastic open loop' controls, systems with stochastic constraints and
deterministic systems. The adjoint process can be explicitly identified. 'Almost minimum'
principles for 'almost optimal' controls can be obtained. Some of these will be discussed
in later work.
5 The Conditional Adjoint Process

5.1 Introduction

Using stochastic flows we calculate below the change in the cost due to a 'strong'
variation of an optimal control. Differentiating this quantity enables us to identify the
adjoint, or co-state variable, and give a partially observed minimum principle. If the drift
coefficient is differentiable in the control variable the related result of Bensoussan [2] follows
from our theorem. Full details will appear in [1]. The method appears simpler than that
employed in Haussmann [4].

5.2  Dynamical  Equations 

Suppose the state of a stochastic system is described by the equation

$$d\xi_t = f(t, \xi_t, u)\,dt + g(t, \xi_t)\,dw_t, \qquad \xi_t \in R^d, \quad \xi_0 = x_0, \quad 0 \le t \le T. \tag{2.1}$$

The control variable $u$ will take values in a compact subset $U$ of some Euclidean space $R^k$.

We shall assume

$A_1$: $x_0 \in R^d$ is given.

$A_2$: $f : [0,T] \times R^d \times U \to R^d$ is Borel measurable, continuous in $u$ for each $(t,x)$,
continuously differentiable in $x$ for each $(t,u)$ and

$$(1 + |x|)^{-1} |f(t,x,u)| + |f_x(t,x,u)| \le K_1.$$

$A_3$: $g : [0,T] \times R^d \to R^d \otimes R^n$ is a matrix valued function, Borel measurable, continuously
differentiable in $x$, and for some $K_2$

$$|g(t,x)| + |g_x(t,x)| \le K_2.$$

The observation process is defined by

$$dy_t = h(\xi_t)\,dt + d\nu_t, \qquad y_t \in R^m, \quad y_0 = 0, \quad 0 \le t \le T. \tag{2.2}$$
In (2.1) and (2.2) $w = (w^1, \ldots, w^n)$ and $\nu = (\nu^1, \ldots, \nu^m)$ are independent Brownian
motions defined on a probability space $(\Omega, F, P)$.

Furthermore, we assume

$A_4$: $h : R^d \to R^m$ is Borel measurable, continuously differentiable in $x$ and

$$|h(x)| + |h_x(x)| \le K_3.$$

REMARK 2.1. These hypotheses can be weakened to those discussed by Haussmann [4].
See [1].

Write $P$ for the Wiener measure on $C([0,T], R^n)$ and $\mu$ for the Wiener measure on
$C([0,T], R^m)$. Let

$$\Omega = C([0,T], R^n) \times C([0,T], R^m)$$

and the coordinate functions in $\Omega$ will be denoted $(x_t, y_t)$. Wiener measure $P$ on $\Omega$ is

$$P(dx, dy) = P(dx)\,\mu(dy).$$

DEFINITION 2.2. $Y = \{Y_t\}$ will be the right continuous, complete filtration on
$C([0,T], R^m)$ generated by

$$Y_t^0 = \sigma\{y_s : s \le t\}.$$

The set of admissible control functions $\underline{U}$ will be the $Y$-predictable functions defined on
$[0,T] \times C([0,T], R^m)$ with values in $U$.

For $u \in \underline{U}$ and $x \in R^d$, $\xi^u_{s,t}(x)$ will denote the strong solution of (2.1) corresponding
to $u$ with $\xi^u_{s,s}(x) = x$.

Define

$$Z^u_{s,t}(x) = \exp\Bigl(\int_s^t h(\xi^u_{s,r}(x))'\,dy_r - \frac{1}{2}\int_s^t |h(\xi^u_{s,r}(x))|^2\,dr\Bigr). \tag{2.3}$$

Note a version of $Z$ defined for every trajectory $y$ can be obtained by integrating the
stochastic integral in the exponential by parts.
If a new probability measure $P^u$ is defined on $\Omega$ by putting

$$\frac{dP^u}{dP} = Z^u_{0,T}(x_0),$$

then under $P^u$, $(\xi^u_{0,t}(x_0), y_t)$ is a solution of the system (2.1) and (2.2). That is, under $P^u$,
$\xi^u_{0,t}(x_0)$ remains a strong solution of (2.1) and there is an independent Brownian motion
$\nu$ such that $y_t$ satisfies (2.2).

Because of hypothesis $A_4$, for $0 \le t \le T$ easy applications of Burkholder's and Gronwall's
inequalities show that

$$E[(Z^u_{0,t}(x_0))^p] < \infty \tag{2.4}$$

for all $u \in \underline{U}$ and all $p$, $1 \le p < \infty$.

COST 2.3. We shall suppose the cost is purely terminal and equals

$$c(\xi^u_{0,T}(x_0))$$

where $c$ is a bounded, differentiable function. If control $u \in \underline{U}$ is used the expected cost is

$$J(u) = E_u[c(\xi^u_{0,T}(x_0))].$$

With respect to $P$, under which $y_t$ is a Brownian motion,

$$J(u) = E[Z^u_{0,T}(x_0)\,c(\xi^u_{0,T}(x_0))]. \tag{2.5}$$

A control $u^* \in \underline{U}$ is optimal if

$$J(u^*) \le J(u)$$

for all $u \in \underline{U}$. We shall suppose there is an optimal control $u^*$.
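The representation (2.5) suggests a simple Monte Carlo scheme: simulate under the reference measure, where the observation is a Brownian motion independent of $w$, and weight each path by the exponential density. The sketch below does this for a scalar model with a constant control. The coefficients, the cost function and all parameter values are hypothetical choices for illustration, not the report's.

```python
import math
import random

def reference_measure_cost(u, x0=0.5, T=1.0, steps=50, n_paths=4000, seed=2):
    """Estimate J(u) = E[Z c(xi_T)] under the reference measure, where the
    observation y is a Brownian motion independent of w. Scalar sketch with
    illustrative f = -x + u, g = 0.3, h = tanh, c = 1 / (1 + x^2), constant u."""
    rng = random.Random(seed)
    dt = T / steps
    sum_Z, sum_Zc = 0.0, 0.0
    for _ in range(n_paths):
        xi, log_Z = x0, 0.0
        for _ in range(steps):
            dy = rng.gauss(0.0, math.sqrt(dt))     # y is Brownian under P
            dw = rng.gauss(0.0, math.sqrt(dt))
            hx = math.tanh(xi)
            log_Z += hx * dy - 0.5 * hx * hx * dt  # exponent of (2.3)
            xi += (-xi + u) * dt + 0.3 * dw
        Z = math.exp(log_Z)
        sum_Z += Z
        sum_Zc += Z / (1.0 + xi * xi)
    return sum_Zc / n_paths, sum_Z / n_paths

cost, mean_Z = reference_measure_cost(u=0.0)
```

The second returned value estimates $E[Z^u_{0,T}(x_0)]$, which should be close to 1 because $Z$ is a density; a value far from 1 would indicate too few paths or too coarse a time step.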
5.3 Flows

For $u \in \underline{U}$ and $x \in R^d$ consider the strong solution

$$\xi^u_{s,t}(x) = x + \int_s^t f(r, \xi^u_{s,r}(x), u_r)\,dr + \int_s^t g(r, \xi^u_{s,r}(x))\,dw_r. \tag{3.1}$$

We wish to consider the behaviour of $\xi^u_{s,t}(x)$ for each trajectory $y$ of the observation
process. In fact the results of Bismut [3] and Kunita [6] extend and show the map

$$\xi^u_{s,t} : R^d \to R^d$$

is, almost surely, a diffeomorphism for each $y \in C([0,T], R^m)$.

Write

$$\|\xi^u(x_0)\|_t = \sup_{0 \le s \le t} |\xi^u_{0,s}(x_0)|.$$

Then, using Gronwall's and Jensen's inequalities, for any $p$, $1 \le p < \infty$

$$\|\xi^u(x_0)\|_T^p \le C\Bigl(1 + |x_0|^p + \Bigl|\int_0^T g(r, \xi^u_{0,r}(x_0))\,dw_r\Bigr|^p\Bigr)$$

almost surely, for some constant $C$.

Using $A_3$ and Burkholder's inequality

$$\|\xi^u(x_0)\|_T \in L^p \quad \text{for } 1 \le p < \infty.$$

Suppose $u^*$ is an optimal control, and write

$$\xi^*_{s,t}(\cdot) \quad \text{for} \quad \xi^{u^*}_{s,t}(\cdot).$$

The Jacobian $\frac{\partial \xi^*_{s,t}}{\partial x}$ is the matrix solution $C_t$ of the equation

$$dC_t = f_x(t, \xi^*_{s,t}(x), u^*)\,C_t\,dt + \sum_{i=1}^n g^{(i)}_x(t, \xi^*_{s,t}(x))\,C_t\,dw^i_t \tag{3.2}$$

with $C_s = I$.

Here $g^{(i)}$ is the $i$th column of $g$ and $I$ is the $d \times d$ identity matrix. Writing
$\|C\|_T = \sup_{0 \le t \le T} |C_t|$ and using Burkholder's, Jensen's and Gronwall's inequalities we see
$\|C\|_T \in L^p$, $1 \le p < \infty$.

Consider the matrix valued process $D$ defined by

$$D_t = I - \int_s^t D_r\,f_x(r, \xi^*_{s,r}(x), u^*_r)\,dr - \sum_{i=1}^n \int_s^t D_r\,g^{(i)}_x(r, \xi^*_{s,r}(x))\,dw^i_r + \sum_{i=1}^n \int_s^t D_r\,\bigl(g^{(i)}_x(r, \xi^*_{s,r}(x))\bigr)^2\,dr. \tag{3.3}$$

Then as in [5] or [6] $d(D_t C_t) = 0$ and $D_s C_s = I$ so

$$D_t = C_t^{-1}.$$

Furthermore, $\|D\|_T \in L^p$, $1 \le p < \infty$.

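Suppose $z_t = z_s + A_t + \sum_{i=1}^n \int_s^t H_i\,dw^i_r$ is a $d$-dimensional semimartingale. Bismut
[3] shows one can consider the process $\xi^*_{s,t}(z_t)$ and in fact:

$$\begin{aligned}
\xi^*_{s,t}(z_t) = z_s &+ \int_s^t \Bigl( f(r, \xi^*_{s,r}(z_r), u^*_r) + \sum_{i=1}^n g^{(i)}_x(r, \xi^*_{s,r}(z_r))\,\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\,H_i + \frac{1}{2}\sum_{i=1}^n \frac{\partial^2 \xi^*_{s,r}}{\partial x^2}(z_r)(H_i, H_i) \Bigr)\,dr \\
&+ \int_s^t \frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\,dA_r + \sum_{i=1}^n \int_s^t \Bigl( g^{(i)}(r, \xi^*_{s,r}(z_r)) + \frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\,H_i \Bigr)\,dw^i_r.
\end{aligned} \tag{3.4}$$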


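DEFINITION 3.1. For $s \in [0,T]$, $h > 0$ such that $0 \le s < s + h \le T$, for any $\tilde u \in \underline{U}$ and
$A \in Y_s$ consider a 'strong' variation $u$ of $u^*$ defined by

$$u(t,\omega) = \begin{cases} u^*(t,\omega) & \text{if } (t,\omega) \notin [s, s+h] \times A, \\ \tilde u(t,\omega) & \text{if } (t,\omega) \in [s, s+h] \times A. \end{cases}$$

THEOREM 3.2. For any strong variation $u$ of $u^*$ consider the process

$$z_t = x + \int_s^t \Bigl(\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\Bigr)^{-1} \bigl( f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r) \bigr)\,dr. \tag{3.5}$$

Then the process $\xi^*_{s,t}(z_t)$ is indistinguishable from $\xi^u_{s,t}(x)$.

PROOF. We shall substitute in (3.4), noting $H_i = 0$ for all $i$. Therefore,

$$\xi^*_{s,t}(z_t) = x + \int_s^t f(r, \xi^*_{s,r}(z_r), u_r)\,dr + \int_s^t g(r, \xi^*_{s,r}(z_r))\,dw_r.$$

The solution of (3.1) is unique, so $\xi^*_{s,t}(z_t) = \xi^u_{s,t}(x)$. Note $u(t) = u^*(t)$ if $t > s + h$ so
$z_t = z_{s+h}$ if $t > s + h$ and

$$\xi^u_{s,t}(x) = \xi^*_{s,t}(z_{s+h}) = \xi^*_{s+h,t}(\xi^u_{s,s+h}(x)). \tag{3.6}$$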


5.4  The  Exponential  Density 

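Consider the $(d+1)$-dimensional system

$$\begin{aligned}
\xi^*_{s,t}(x) &= x + \int_s^t f(r, \xi^*_{s,r}(x), u^*_r)\,dr + \int_s^t g(r, \xi^*_{s,r}(x))\,dw_r, \\
Z^*_{s,t}(x,z) &= z + \int_s^t Z^*_{s,r}(x,z)\,h(\xi^*_{s,r}(x))'\,dy_r.
\end{aligned} \tag{4.1}$$

That is, we are considering an augmented flow $(\xi, Z)$ in $R^{d+1}$ in which $Z^*$ has a variable
initial condition $z \in R$. Note:

$$Z^*_{s,t}(x,z) = z\,Z^*_{s,t}(x).$$

The map $(x,z) \to (\xi^*_{s,t}(x), Z^*_{s,t}(x,z))$ is, almost surely, a diffeomorphism of $R^{d+1}$. Clearly,

$$\frac{\partial \xi^*_{s,t}}{\partial z} = 0, \qquad \frac{\partial f}{\partial z} = 0 \qquad \text{and} \qquad \frac{\partial g}{\partial z} = 0.$$

The Jacobian of this augmented map is represented by the matrix

$$\tilde C_t = \begin{pmatrix} \dfrac{\partial \xi^*_{s,t}}{\partial x} & 0 \\[2mm] \dfrac{\partial Z^*_{s,t}}{\partial x} & \dfrac{\partial Z^*_{s,t}}{\partial z} \end{pmatrix}.$$

In particular, from (4.1), for $1 \le i \le d$

$$\frac{\partial Z^*_{s,t}}{\partial x_i} = \sum_{j=1}^m \int_s^t \Bigl( Z^*_{s,r}\,\sum_{k=1}^d \frac{\partial h^j}{\partial \xi_k}\,\frac{\partial \xi^*_{k,s,r}}{\partial x_i} + h^j\,\frac{\partial Z^*_{s,r}}{\partial x_i} \Bigr)\,dy^j_r. \tag{4.2}$$

We are interested in solutions of (4.1) and (4.2) only when $z = 1$, so as above we write
$Z^*_{s,t}(x)$ for $Z^*_{s,t}(x,1)$ etc.

LEMMA 4.1.

$$\frac{\partial Z^*_{s,t}(x)}{\partial x} = Z^*_{s,t}(x) \Bigl( \int_s^t h_x(\xi^*_{s,r}(x)) \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,d\nu_r \Bigr)$$

where, as in (2.2), $d\nu_t = dy_t - h(\xi^*_{s,t}(x))\,dt$.

PROOF. From (4.2)

$$\frac{\partial Z^*_{s,t}(x)}{\partial x} = \int_s^t \Bigl( \frac{\partial Z^*_{s,r}(x)}{\partial x}\,h'(\xi^*_{s,r}(x)) + Z^*_{s,r}(x)\,h_x(\xi^*_{s,r}(x)) \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x} \Bigr)\,dy_r. \tag{4.3}$$

Write

$$L_{s,t}(x) = Z^*_{s,t}(x) \Bigl( \int_s^t h_x(\xi^*_{s,r}(x)) \cdot \frac{\partial \xi^*_{s,r}(x)}{\partial x}\,d\nu_r \Bigr).$$

Then

$$Z^*_{s,t}(x) = 1 + \int_s^t Z^*_{s,r}(x)\,h'(\xi^*_{s,r}(x))\,dy_r$$

and the product rule gives

$$L_{s,t}(x) = \int_s^t L_{s,r}(x)\,h'(\xi^*_{s,r}(x))\,dy_r + \int_s^t Z^*_{s,r}(x)\,h_x \cdot \frac{\partial \xi^*_{s,r}}{\partial x}\,dy_r.$$

Consequently, $L_{s,t}(x)$ is also a solution of (4.3), so by uniqueness

$$L_{s,t}(x) = \frac{\partial Z^*_{s,t}(x)}{\partial x}.$$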


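LEMMA 4.2. If $z_t$ is as defined in (3.5)

$$Z^*_{s,t}(z_t) = Z^u_{s,t}(x).$$

PROOF. Applying (3.4) to $Z^*_{s,t}(z_t)$ we see:

$$\begin{aligned}
Z^*_{s,t}(z_t) &= 1 + \int_s^t Z^*_{s,r}(z_r)\,h'(\xi^*_{s,r}(z_r))\,dy_r \\
&= 1 + \int_s^t Z^*_{s,r}(z_r)\,h'(\xi^u_{s,r}(x))\,dy_r
\end{aligned} \tag{4.4}$$

by Theorem 3.2. However, (4.4) has a unique solution so

$$Z^*_{s,t}(z_t) = Z^u_{s,t}(x).$$

Again, note that for $t > s + h$

$$Z^*_{s,t}(z_t) = Z^*_{s,t}(z_{s+h}). \tag{4.5}$$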


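5.5 The Adjoint Process

$u^*$ will be an optimal control and $u$ a perturbation of $u^*$ as in Definition 3.1. Again
write

$$x = \xi^*_{0,s}(x_0).$$

The minimum cost is

$$\begin{aligned} J(u^*) &= E[Z^*_{0,T}(x_0)\,c(\xi^*_{0,T}(x_0))] \\ &= E[Z^*_{0,s}(x_0)\,Z^*_{s,T}(x)\,c(\xi^*_{s,T}(x))]. \end{aligned}$$

The cost corresponding to the perturbed control $u$ is

$$\begin{aligned} J(u) &= E[Z^*_{0,s}(x_0)\,Z^u_{s,T}(x)\,c(\xi^u_{s,T}(x))] \\ &= E[Z^*_{0,s}(x_0)\,Z^*_{s,T}(z_{s+h})\,c(\xi^*_{s,T}(z_{s+h}))] \end{aligned}$$

by (3.6) and (4.5). Recall $Z^*_{s,T}(\cdot)$ and $c(\xi^*_{s,T}(\cdot))$ are differentiable almost surely, with
continuous and uniformly integrable derivatives. Consequently, writing

$$\Gamma(s, z_r) = Z^*_{0,s}(x_0)\,Z^*_{s,T}(z_r) \Bigl\{ c_\xi(\xi^*_{s,T}(z_r))\,\frac{\partial \xi^*_{s,T}}{\partial x}(z_r) + c(\xi^*_{s,T}(z_r)) \Bigl( \int_s^T h_\xi(\xi^*_{s,\sigma}(z_r))\,\frac{\partial \xi^*_{s,\sigma}}{\partial x}(z_r)\,d\nu_\sigma \Bigr) \Bigr\} \Bigl(\frac{\partial \xi^*_{s,r}}{\partial x}(z_r)\Bigr)^{-1}$$

for $s \le r \le s + h$, we have

$$\begin{aligned} J(u) - J(u^*) &= E[Z^*_{0,s}(x_0)\{Z^*_{s,T}(z_{s+h})\,c(\xi^*_{s,T}(z_{s+h})) - Z^*_{s,T}(x)\,c(\xi^*_{s,T}(x))\}] \\ &= E\Bigl[\int_s^{s+h} \Gamma(s, z_r)\,(f(r, \xi^*_{s,r}(z_r), u_r) - f(r, \xi^*_{s,r}(z_r), u^*_r))\,dr\Bigr]. \end{aligned} \tag{5.1}$$

This formula describes the change in the expected cost arising from the perturbation $u$ of
the optimal control. However, $J(u) \ge J(u^*)$ for all $u \in \underline{U}$ so the right hand side of (5.1)
is non-negative for all $h > 0$. We wish to divide by $h > 0$ and let $h \to 0$. This requires
some careful arguments using the uniform boundedness of the random variables and the
monotone class theorem. It can be shown that there is a set $S \subset [0,T]$ of zero Lebesgue
measure such that if $s \notin S$

$$E[\Gamma(s,x)(f(s, \xi^*_{0,s}(x_0), u) - f(s, \xi^*_{0,s}(x_0), u^*))\,I_A] \ge 0 \tag{5.2}$$

for any $u \in \underline{U}$ and $A \in Y_s$.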

Details of this argument can be found in [1]. Define

$$p_s(x) = E^*\Bigl[ c_\xi(\xi^*_{0,T}(x_0))\,\frac{\partial \xi^*_{s,T}(x)}{\partial x} + c(\xi^*_{0,T}(x_0)) \Bigl( \int_s^T h_\xi(\xi^*_{0,\sigma}(x_0))\,\frac{\partial \xi^*_{s,\sigma}(x)}{\partial x}\,d\nu_\sigma \Bigr) \Bigm| Y_s \vee \{x\} \Bigr]$$

where $x = \xi^*_{0,s}(x_0)$ and $E^*$ denotes the expectation under $P^* = P^{u^*}$.

In (5.2) we have established the following:

THEOREM 5.1. $p_s(x)$ is the adjoint process for the partially observed optimal control
problem. That is, if $u^* \in \underline{U}$ is optimal there is a set $S \subset [0,T]$ of zero Lebesgue measure
such that for $s \notin S$

$$E^*[p_s(x)\,f(s,x,u^*) \mid Y_s] \le E^*[p_s(x)\,f(s,x,u) \mid Y_s] \quad \text{a.s.} \tag{5.3}$$

so the optimal control $u^*$ almost surely minimizes the conditional Hamiltonian.

If $x = \xi^*_{0,s}(x_0)$ has a conditional density $q_s(x)$ under $P^*$, and if $f$ is differentiable
in $u$, (5.3) implies

$$\sum_{j=1}^k (u_j(s) - u^*_j(s)) \int_{R^d} \Gamma(s,x)\,\frac{\partial f}{\partial u_j}(s, x, u^*)\,q_s(x)\,dx \ge 0.$$

This is the result of Bensoussan [2].

REFERENCES 

1.  J.  Baras,  R.J.  Elliott  and  M.  Kohlmann,  The  partially  observed  stochastic  minimum 
principle.  University  of  Alberta  Technical  Report,  1987,  submitted. 

2. A. Bensoussan, Maximum principle and dynamic programming approaches of the
optimal control of partially observed diffusions. Stochastics, 9(1983), 169-222.

3.  J.M.  Bismut,  A  generalized  formula  of  Ito  and  some  other  properties  of  stochastic 
flows.  Zeits.  fur  Wahrs.  55(1981),  331-350. 

4.  U.G.  Haussmann,  The  maximum  principle  for  optimal  control  of  diffusions  with  partial 
information.  S.I.A.M.  Jour.  Control  and  Opt.  25(1987),  341-361. 


5. N. Ikeda and S. Watanabe, Stochastic differential equations and diffusion processes.
North Holland Publishing Co., Amsterdam, Oxford, New York, 1981.

6. H. Kunita, The decomposition of solutions of stochastic differential equations. Lecture
Notes in Math., 851(1980), 213-255.


REPORT DOCUMENTATION PAGE (DD Form 1473, JUN 86)

1a. Report security classification: UNCLASSIFIED
4. Performing organization report number: SEI TR-87-13
6a. Name of performing organization: Systems Engineering, Inc.
6c. Address: 7833 Walker Drive, Suite 308, Greenbelt, MD 20770
7a. Name of monitoring organization: U.S. Army Research Office
7b. Address: P.O. Box 12211, Research Triangle Park, NC 27709-2211
9. Procurement instrument identification number: DAAL03-86-C-0014
11. Title: Analytical Methods in Stochastic Control and Nonlinear Filtering (U)
12. Personal authors: Dr. J. S. Baras and Dr. G. L. Blankenship
13a. Type of report: Final
13b. Time covered: from 6/1/86 to 8/30/81
14. Date of report: 87 DEC 31
15. Page count: 112
20. Distribution/availability of abstract: UNCLASSIFIED/UNLIMITED
21. Abstract security classification: UNCLASSIFIED
22a. Name of responsible individual: Patsy Ashe

19. Abstract

The focus of this report is on advanced tools for the analysis of nonlinear stochastic
control and filtering systems.

In Sections 1 and 2, we present a series of results on the analysis of certain classes of
nonlinear filtering problems using comparatively simple bounding techniques. We consider
both problems with small noise (large signal to noise ratios) and weakly nonlinear systems.
We show that the optimal nonlinear filters can be well approximated by linear filters which
are very easy to implement. Moreover, we provide sharp estimates of the degree of
suboptimality involved in using the linear approximating filters.

In Section 3, we consider the problem of managing the estimation of a (nonlinear) diffusion
process by a system employing several sensors. The essential problem is to 'schedule' the
use of the sensor to optimize the estimate of a function of the state of the diffusion
process. The solution is obtained in terms of a system of quasi-variational inequalities
in the space of solutions of certain Zakai equations.

In Section 4, we provide a new proof of the minimum principle in stochastic optimal
control theory for systems of partially observed diffusions. In Section 5, we provide
a concise analysis of the 'conditional adjoint process' arising in the stochastic
minimum principle for partially observed diffusion processes.

